(54) Cellulose synthase gene

(57) mRNA was extracted from cotton plant fiber cells at the stage at which they accumulate cellulose,
and cDNAs complementary thereto were synthesized to construct a cDNA library. From the library, 750
clones were arbitrarily selected and subjected to random sequencing. From the nucleotide sequences
obtained for the respective clones, those having homology to an amino acid sequence deduced from the
gene of cellulose 4-β-glucosyltransferase (bcsA) of the cellulose synthase operon of an acetic acid
bacterium were selected. Thus, DNA coding for cellulose synthase was obtained.
Description

Technical Field 

The present invention relates to a DNA coding for cellulose synthase originating from cotton plant
(Gossypium hirsutum), a recombinant DNA containing the same, a transformed cell transformed with the
DNA, and a method for controlling cellular cellulose synthesis.

Background Art 


Cellulose is used for paper, woody structural materials, fiber, cloth, food, cosmetics, and pharmaceuticals,
and it is also utilized as an energy source. Cellulose is therefore industrially useful and valuable. Cellulose
is capable of forming a variety of crystalline structures, and hence it is expected that new materials can be
developed by controlling the enzymes involved in cellulose biosynthesis. The cellulose-related industry has
hitherto been directed to cellulose products that have already been produced, and there has been no attempt
to develop a new material from the standpoint of biosynthesis. The mechanism by which pathogenic
microorganisms cause disease in plants often results from inhibition of cellulose biosynthesis, as in the
case of Pyricularia oryzae (P. oryzae). Therefore, imparting disease resistance to the cellulose biosynthesis
mechanism is agriculturally applicable and valuable. Further, cellulose is the most abundant organic compound
on the earth, and it is the sink in which the largest amount of atmospheric CO2 is fixed. Therefore, the
genetic improvement of cellulose biosynthesis enzymes is also applicable to industries directed to the
control of atmospheric CO2 based on the use of cellulose as the sink.

In recent years, cDNAs originating from fiber cells of cotton plant have been randomly sequenced, and it has
been reported that full length CelA1 and partial length CelA2 probably represent cDNAs of cotton plant
cellulose synthase, in view of their homology to the bacterial cellulose synthase gene (bacterial BcsA)
(Pear et al., Proceedings of the National Academy of Sciences, USA (1996) 93, 12637-12642). The binding
ability to UDP-glucose has been demonstrated for CelA1. However, as for CelA2, homology has been
demonstrated only for the C-terminal amino acid sequence.

Disclosure of the Invention 

The present invention has been made in order to provide a new method for regulating cellulose production in
prokaryotic cells or eukaryotic cells, an object of which is to provide a DNA coding for cellulose synthase,
a recombinant DNA containing the same, a transformed cell transformed with the DNA, and a method for
regulating cellular cellulose synthesis.

The present inventors first extracted mRNA from cotton plant fiber cells at the stage at which they accumulate
cellulose, and cDNAs complementary thereto were synthesized to construct a cDNA library. From the library,
750 cDNA clones were arbitrarily selected and subjected to random sequencing. Six amino acid sequences were
derived from the nucleotide sequence of each of the obtained clones, in order to select those having homology
to an amino acid sequence obtained by translation of the gene of cellulose 4-β-glucosyltransferase (bcsA) of
the cellulose synthase operon of acetobacterium. As a result, genes which were classified into three types or
groups were found, and they were designated as PcsA1, PcsA2, and PcsA3 respectively (PcsA is an abbreviation
of "Plant Cellulose Synthase A").

That is, the present invention lies in a DNA coding for any one of the following proteins (A) to (C):

(A) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
2 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino
acids with respect to SEQ ID NO: 2;

(B) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
4 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino
acids with respect to SEQ ID NO: 4; and

(C) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
8 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino
acids with respect to SEQ ID NO: 8, and comprising an amino acid sequence shown in SEQ ID NO: 11 or an amino
acid sequence involving deletion, substitution, insertion, or addition of one or several amino acids with
respect to SEQ ID NO: 11.

In another aspect, the present invention provides a recombinant vector comprising all or a part of the DNA as
defined above, and a transformed cell transformed with the DNA as defined above.

In still another aspect, the present invention provides a method for regulating cellulose synthesis in a cell,
comprising the steps of introducing the DNA as defined above into the cell, and expressing RNA having a
nucleotide sequence homologous to the DNA as defined above or a nucleotide sequence complementary to the DNA
as defined above.

SEQ ID NO: 1 corresponds to a sequence of PcsA1, and SEQ ID NO: 3 corresponds to a sequence of PcsA2.
SEQ ID NO: 5 corresponds to a sequence of the 3'-side region of PcsA3, SEQ ID NO: 7 corresponds to a sequence
of the 5'-side region of PcsA3, and SEQ ID NO: 9 corresponds to a sequence of an internal region of PcsA3.

It has been demonstrated, by expression in eukaryotic cells (animal cells and/or yeast), that PcsA1 and PcsA2
of the DNAs described above are DNAs coding for cotton plant cellulose synthase. It has also been demonstrated
that an antibody against the expressed product inhibits the cotton plant cellulose synthase activity in a
cell-free system. Further, PcsA3, which is different from PcsA1 and PcsA2, has been found. Each of these
species was obtained only as a partial clone at the random sequencing stage, and none contained the
5'-portion of the coding region. Therefore, clones having the sequences of the 5'-portions were isolated in
accordance with the 5'-RACE method based on the use of PCR, and their sequences were determined. As a result
of this operation, the sequences of the 5'-portions corresponding to the partial length clones were obtained
for PcsA1 and PcsA2.

On the other hand, as for PcsA3, a sequence of a 5'-portion of another clone, which was considered to belong to
the same PcsA3 group, was obtained. The two sequences had extremely high homology, and hence they were
considered to have arisen relatively recently from an identical gene through duplication to form a multigene
family. Therefore, even when the two are combined with each other at corresponding portions to construct a
fused gene followed by expression, it is assumed that the activity and function of the produced enzyme are
not affected thereby.

As for PcsA1 and PcsA2, in order to obtain a full length clone, primers were designed on the basis of the
sequence of the 5'-portion and the sequence of the 3'-portion of the partial length clone to perform PCR.
Thus, a clone containing the ORF was obtained.

The template to be used for the RACE method may be either cDNA synthesized from mRNA or a phage library.
When the phage library is used, it is possible to use a sequence in the vector as a 5'-side primer.

As a result of random sequencing, 15 clones appeared to code for the cellulose synthase; among them, clones
of PcsA2 were the most abundant, with seven clones. Expression was confirmed in eukaryotic cells (animal
cells and/or yeast) transformed with the cellulose synthase gene. As a result, the cellulose synthase
activity was observed.

The present invention will be explained in detail below. 

<1> Preparation of cotton plant cDNA library 

Cotton plant fiber cells at the stage of cellulose accumulation are preferably used as a material for
extracting mRNA to construct a cotton plant cDNA library. The method for extracting mRNA is not specifically
limited, and an ordinary method for extracting mRNA from plants may be adopted.

cDNA can be synthesized, for example, by using a poly T sequence which is complementary to poly A nucleotide 
existing at the terminal of mRNA as a primer to synthesize complementary DNA by the aid of reverse transcriptase, 
and forming a double strand by the aid of DNA polymerase. 

The method therefor is described, for example, in Molecular Cloning (Maniatis et al., Cold Spring Harbor
Laboratory). Alternatively, a variety of cDNA synthesis kits are commercially available from various
companies, and any of them may be used.

Generally, the library is constructed by using a phage vector. A variety of commercially available vectors
are usable. However, it is preferable to use a vector, for example, λZAP vector, with which it is unnecessary
to perform recloning from the vector, and with which a plasmid for sequencing can be prepared immediately.

<2> Determination of nucleotide sequence of cDNA 

Clones are randomly selected from the obtained cDNA library to determine nucleotide sequences of inserts in the 
clones. The nucleotide sequence can be determined in accordance with the Maxam-Gilbert method or the dideoxy 
method. Among them, the dideoxy method is more convenient and preferred. 

The nucleotide sequence can be determined in accordance with the dideoxy method by using a commercially
available sequencing kit. Further, the use of an automatic sequencer makes it possible to determine the
sequences of a large number of clones in a short period of time.

It is unnecessary to determine the sequence for an entire length of the insert. It is enough to determine a length 
of nucleotide sequence which is considered to be sufficient to perform homology search. For example, in Examples 
described later on, the homology search as described below was performed when a sequence having not less than 
60 nucleotides was successfully determined. 
<3> Homology search with gene data base

The determined nucleotide sequence of each of the cDNA clones is used to perform the homology search with
respect to known amino acid sequences of the cellulose synthase, or nucleotide sequences of genes coding
therefor, registered in the gene data base. The cellulose synthase is exemplified by an enzyme encoded by
the gene of cellulose 4-β-glucosyltransferase (BcsA) of the cellulose synthase operon of acetobacterium
(Wong, H. C. et al., Proc. Natl. Acad. Sci. U.S.A., 87, 8130-8134 (1990), ACCESSION No. M37202).

Those usable as the data base include, for example, GenBank, EMBL, and DDBJ published, for example, from
Los Alamos National Institute in the United States, Institute of European Molecular Biology, and National
Institute of Genetics (Japan). Programs usable for the homology search include, for example, commercially
available DNA analysis software such as DNASIS (Hitachi Software Engineering Co., Ltd.) and GENETYX (SDC
Software Development). The following methods are also available. That is, a computer terminal is connected
with the host computer of National Institute of Genetics to perform analysis. Alternatively, a personal
computer is connected over the Internet with NCBI (National Center for Biotechnology Information) to utilize
BLAST (Basic Local Alignment Search Tool) (http://www.ncbi.nlm.nih.gov/BLAST/) so that high speed homology
search is performed.

The homology search is performed, for example, in accordance with the following algorithm. When the homology
search is performed for a nucleotide sequence, the nucleotide sequence to be investigated is compared with
each individual gene sequence included in the data base while being shifted by one nucleotide at a time.
When six or more continuous nucleotides are coincident, the homology score is counted and calculated in
accordance with a homology score table (see, for example, M. Dayhoff, Atlas of Protein Sequence and
Structure, vol. 5 (1978)). The system is set so that those having a score not less than a certain value are
picked up as candidates which have homology. Further, gaps may be introduced into the sequence to be
investigated or into the gene sequence included in the data base to optimize the alignment so that the score
is maximized.
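
The seed step of this search can be sketched as follows. This is only a minimal illustration, not the algorithm of any particular program: the function names `shared_seeds` and `extend_hit` and the toy sequences are hypothetical, scoring with a score table and gap optimization are omitted, and only exact runs of six or more coincident nucleotides are reported.

```python
def shared_seeds(query, subject, k=6):
    """Return (query_pos, subject_pos) pairs (0-based) where k contiguous nucleotides coincide."""
    query, subject = query.upper(), subject.upper()
    # Index every k-mer of the data base sequence by its start positions.
    index = {}
    for j in range(len(subject) - k + 1):
        index.setdefault(subject[j:j + k], []).append(j)
    hits = []
    # Shift along the query one nucleotide at a time.
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, k=6):
    """Extend an exact seed to the right while nucleotides keep matching; return the run length."""
    query, subject = query.upper(), subject.upper()
    length = k
    while (i + length < len(query) and j + length < len(subject)
           and query[i + length] == subject[j + length]):
        length += 1
    return length


# Hypothetical toy sequences, for illustration only.
query = "TTGACCGGTAAC"
subject = "AAGACCGGTTT"
seeds = shared_seeds(query, subject)
print(seeds)
print(extend_hit(query, subject, *seeds[0]))
```

In a fuller implementation, each extended run would then be scored and only candidates above a chosen threshold would be kept.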

When the homology search is performed for an amino acid sequence, the nucleotide sequence to be investigated
is translated into amino acids in all six reading frames, including those of the complementary strand. The
investigation may then be performed in the same manner as for the nucleotide sequence. Specifically, it is
possible to use blastx of BLAST described above. As for detailed techniques and conditions for the search,
reference may be made to DDBJ News Letter, No. 15 (February 1995).
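
The conversion into all six reading frames can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the function names and the toy sequence are hypothetical, and any codon containing a character other than A, C, G, or T is written as X.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return the translations of frames +1, +2, +3 and -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


# Hypothetical toy sequence, for illustration only.
for name, protein in six_frame_translation("ATGCAAGTTCGTTGGTAA").items():
    print(name, protein)
```

Each of the six translations would then be compared against the reference amino acid sequence, which is what a translated search such as blastx does internally.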

<4> Isolation of cDNA clone of cotton plant cellulose synthase

The clone obtained as described above does not necessarily contain the entire nucleotide sequence of the
gene. In such a case, the clone is used as a probe to perform screening by means of plaque hybridization.
Thus, it is possible to obtain a clone containing a full length gene from the library. The specific method
may be carried out with reference to Molecular Cloning, second edition (Maniatis et al., Cold Spring Harbor
Laboratory) 12.30 to 12.40.

When the obtained cDNA lacks the 5'-portion, the 5'-portion can also be obtained by synthesizing primers so
that the cDNA sequence is elongated toward the 5'-terminal, and performing RT-PCR by using mRNA as a
template.

As demonstrated in Examples described later on, the DNA of the present invention has been obtained as DNA
having homology to the known bacterial cellulose synthase gene. The DNA further codes for an amino acid
sequence GlnXxxXxxArgTrp (SEQ ID NO: 12), which is considered to form a UDP-glucose binding domain, and has
high homology in the vicinity thereof.

The nucleotide sequences of DNA of the present invention obtained as described above and the amino acid
sequences deduced from the nucleotide sequences are shown in SEQ ID NOs: 1 to 10 in Sequence Listing. SEQ ID
NOs: 1 and 3 show nucleotide sequences of PcsA1 and PcsA2 respectively. SEQ ID NOs: 2 and 4 show amino acid
sequences deduced from the nucleotide sequences of PcsA1 and PcsA2 respectively.

SEQ ID NOs: 5 and 6 show a nucleotide sequence of a clone (PcsA3-682) containing the 3'-side region of PcsA3
and an amino acid sequence deduced from the nucleotide sequence respectively. SEQ ID NOs: 7 and 8 show a
nucleotide sequence of a 5'-portion (PcsA3-5') of another clone containing the 5'-side region of PcsA3 and
an amino acid sequence deduced from the nucleotide sequence respectively. SEQ ID NOs: 9 and 10 show a
nucleotide sequence of a 3'-portion (PcsA3-3') of that clone and an amino acid sequence deduced from the
nucleotide sequence respectively (see Fig. 1). That is, SEQ ID NO: 5 corresponds to the 3'-side region of
PcsA3, SEQ ID NO: 7 corresponds to the 5'-side region of PcsA3, and SEQ ID NO: 9 corresponds to an internal
region of PcsA3. In the overlapping portion, PcsA3-682 differs from PcsA3-3' by 9 nucleotides in the
nucleotide sequence and by 1 amino acid in the amino acid sequence. Figs. 3 and 4 show the comparison
between the nucleotide sequences of PcsA3-682 and PcsA3-3'. SEQ ID NO: 11 shows a combination of the amino
acid sequences encoded by PcsA3-682 and PcsA3-3'.

The sequence of GlnXxxXxxArgTrp (SEQ ID NO: 12) corresponds to amino acid numbers 710 to 714 in SEQ ID
NO: 2 for PcsA1, amino acid numbers 778 to 782 in SEQ ID NO: 4 for PcsA2, and amino acid numbers 356 to 360
in SEQ ID NO: 6 for PcsA3.
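
Locating this motif in a deduced amino acid sequence can be sketched as follows. This is a minimal illustration which assumes that the motif of SEQ ID NO: 12 is read as Gln, two arbitrary residues, Arg, and Trp, as implied by the five-residue ranges given above; the function name `find_motif` and the toy peptide are hypothetical and are not taken from the Sequence Listing.

```python
import re

# Gln, any two residues, Arg, Trp, in one-letter code.
MOTIF = re.compile(r"Q..RW")


def find_motif(protein):
    """Return (first, last) 1-based amino acid numbers of each non-overlapping motif match."""
    return [(m.start() + 1, m.end()) for m in MOTIF.finditer(protein.upper())]


# Hypothetical toy peptide, for illustration only.
print(find_motif("MKLDQVLRWALGSV"))
```

Applied to the sequences of SEQ ID NOs: 2, 4, and 6, such a search would be expected to report the ranges stated above, provided the assumed reading of the motif is correct.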

PcsA1 differs from CelA1 reported by Pear et al. (Proceedings of the National Academy of Sciences, USA
(1996), 93, 12637-12642) in nucleotide sequence by 28 nucleotides. As a result, the former differs from the
latter by 10 amino acid residues in the encoded amino acid sequence. In general, the sugar chain specificity
and the substrate specificity of a sugar chain transferase are greatly changed by point mutation of the
nucleotides of DNA (Yamamoto and Hakomori, The Journal of Biological Chemistry (1990) 265, 19257-19262).
Therefore, it is unclear whether or not CelA1 codes for a protein having the cellulose synthase activity.
Incidentally, the 48th Arg, the 56th Ser, the 81st Asn, the 104th Ala, the 110th Ser, the 247th Asp, the
376th Asp, the 386th Ser, the 409th Arg, and the 649th Ser in the amino acid sequence encoded by CelA1
correspond to Gln, Ile, Ser, Thr, Pro, Asn, Glu, Pro, His, and Gly in PcsA1 respectively.

PcsA2 of the present invention contains the same sequence as that of CelA2 reported by Pear et al. However, 
CelA2 has an incomplete length, and it does not contain the entire coding region. CelA2 corresponds to nucleotide 
numbers of 1083 to 3311 in the nucleotide sequence of PcsA2 shown in SEQ ID NO: 3. 

Any of the amino acid sequences shown in SEQ ID NOs: 2, 4, 6, 8, 10, and 11 is a novel sequence. All genes
having nucleotide sequences coding for the amino acid sequences are included in the present invention.

The amino acid sequences described above may include deletion, substitution, insertion, and/or addition of
one or more amino acid residues provided that the characteristic of the gene of the present invention is not
substantially affected. Such deletion, substitution, insertion, and/or addition of one or more amino acid
residues is obtainable by modifying the DNAs coding for the amino acid sequences shown in SEQ ID NOs: 2, 4,
6, 8, 10, and 11 randomly in accordance with an ordinary mutation treatment or intentionally in accordance
with the site-directed mutagenesis method. As described above, in general, the sugar chain specificity and
the substrate specificity of a sugar chain transferase are greatly changed by point mutation of the
nucleotides of DNA. Therefore, DNA coding for a protein having the cellulose synthase activity is selected
from the modified DNAs. The cellulose synthase activity can be measured, for example, by means of the method
described by T. Hayashi: Measuring β-glucan deposition in plant cell walls, in Modern Methods of Plant
Analysis: Plant Fibers, eds. H. F. Linskens and J. F. Jackson, Springer-Verlag, 10: 138-160 (1989).

Plants harboring proteins or genes partially different from the sequences shown in Sequence Listing may
exist depending on, for example, the variety of cotton plant or natural mutation. Such genes are also
included in the gene of the present invention. Such a gene may be obtained as DNA which is hybridizable
under the stringent condition with all or a part of the coding region of the nucleotide sequence shown in
SEQ ID NO: 1, 3, 5, 7, or 9. The "stringent condition" referred to herein indicates a condition under which
a so-called specific hybrid is formed, and a non-specific hybrid is not formed. It is difficult to express
such a condition definitely by using a numerical value. However, the stringent condition is exemplified by
a condition under which nucleic acids having high homology, for example, DNAs having homology of not less
than 80 %, undergo hybridization with each other, and nucleic acids having homology lower than the above do
not undergo hybridization with each other.
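
The 80 % figure can be related to a simple sequence comparison, sketched below. This is only an illustrative assumption: actual hybridization behavior depends on temperature, salt concentration, and probe length, which identity alone does not capture. The function names, the handling of gaps, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the aligned, ungapped columns of two equal-length strings ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # columns containing a gap are skipped
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def meets_homology_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the identity is not less than the threshold (80 % in the example above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, for illustration only.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(meets_homology_threshold("ACGTACGTAC", "ACCTTCGAAC"))
```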

<5> Utilization of gene of the present invention 

The DNA of the present invention makes it possible to control the cellulose synthesis in prokaryotic cells
such as acetobacterium and/or eukaryotic cells such as yeasts belonging to, for example, the genus
Saccharomyces, cells of plants such as cotton plant, and cultured cells of mammals and the like.

Specifically, the cellulose synthesis in the cells as described above can be enhanced, for example, by
connecting a promoter to an upstream region of the DNA of the present invention, inserting the obtained
fragment into an appropriate vector to construct a recombinant vector, and introducing the vector into the
cells. Alternatively, the cellulose synthesis in the cells can be suppressed by introducing an antisense
gene of the DNA of the present invention into the cells.
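
The relationship between a sense sequence and the antisense RNA expressed from an antisense gene can be sketched as follows. This is a minimal illustration; the function name `antisense_rna` and the toy sequence are hypothetical, and the choice of which region of the gene to express in antisense orientation is not addressed.

```python
def antisense_rna(sense_dna):
    """Return the RNA complementary to the sense strand, written 5' to 3'."""
    # Complement each base (A pairs with U in RNA), then reverse the strand.
    complement = str.maketrans("ACGT", "UGCA")
    return sense_dna.upper().translate(complement)[::-1]


# Hypothetical toy sequence, for illustration only.
print(antisense_rna("ATGGCC"))
```

The antisense RNA is complementary to the mRNA transcribed from the sense strand and can therefore pair with it.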

The promoter and the vector may be selected from those ordinarily utilized to express heterologous genes, and
the method ordinarily employed to express heterologous genes may be used as the transformation method.
Specifically, in the case of yeast, it is possible to use a protein-expressing kit produced by Invitrogen,
i.e., Pichia Expression Kit, and a vector pPIC9 contained in this kit. For example, COS7 cells may be used
as mammalian cultured cells, and a vector CDM8 may be used therefor.

The present invention provides the DNA coding for cellulose synthase. The DNA provides a new method for con- 
trolling cellulose production by incorporating the DNA into prokaryotic cells and eukaryotic cells. 

Brief Description of the Drawings 


Fig. 1 shows a relationship between two clones of PcsA3 as an embodiment of the DNA of the present invention. 
Regions interposed between arrows indicate regions for which nucleotide sequences have been determined. A dotted 
line indicates a region for which no nucleotide sequence has been determined. 
Fig. 2 shows a structure of EcoRI adapter.

Fig. 3 shows comparison between sequences of PcsA3-682 and PcsA3-3' (former half).
Fig. 4 shows comparison between sequences of PcsA3-682 and PcsA3-3' (latter half). ":" indicates coincident
nucleotides, and "*" indicates non-coincident nucleotides.


Best Mode for Carrying Out the Invention 

Examples of the present invention will be explained below. 

<1> Preparation of total RNA from cotton plant

Cotton plant (Gossypium hirsutum L.) Coker 312 was used as a material. Fiber cells at 16 to 18 days post
anthesis were collected in liquid nitrogen. The cotton plant fiber cells in an amount of 75 g were thoroughly
ground in a mortar while being kept frozen with liquid nitrogen. The powdered fiber was transferred to a
centrifuge tube equipped with a cap, to which 375 mg of DTT as a powder was added, followed by addition of
200 ml of XT buffer (obtained by adjusting 0.2 M sodium borate containing 30 mM EDTA and 1 % SDS to pH 9.0,
applying a diethylpyrocarbonate treatment followed by autoclaving, and then adding vanadyl ribonucleoside to
the obtained solution to give a concentration of 10 mM) which had been heated to 90 to 95 °C. The obtained
solution was thoroughly agitated.

To the solution, 100 mg of proteinase K was added, and the solution was agitated again. The solution was
incubated at 40 °C for 2 hours, and then 16 ml of 2 M KCl was added thereto. The solution was thoroughly
agitated again and left to stand on ice for 1 hour, followed by centrifugation for 20 minutes (4 °C) at
12,000 g by using a high speed refrigerated centrifuge.

The obtained supernatant was filtered to remove floating matter. The solution was transferred to a measuring
cylinder to measure its volume. The solution was then transferred to another centrifuge tube, to which
lithium chloride was added in an amount of 85 mg per 1 ml of the extract solution to give a final
concentration of 2 M. The solution was left to stand at 4 °C overnight, and then the precipitated RNA was
separated by centrifugation for 20 minutes at 12,000 g. The obtained precipitate of RNA was washed and
precipitated twice with cooled 2 M lithium chloride.
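
The correspondence between 85 mg of lithium chloride per 1 ml and a concentration of 2 M can be checked with a short calculation. This is a minimal sketch which assumes a molar mass of 42.39 g/mol for anhydrous LiCl (a value not stated in the text) and neglects the volume change caused by dissolving the salt; the function name is hypothetical.

```python
LICL_MOLAR_MASS_G_PER_MOL = 42.39  # assumed value for anhydrous LiCl (Li 6.94 + Cl 35.45)


def molarity_from_mg_per_ml(mg_per_ml, molar_mass_g_per_mol):
    """Convert a mass concentration in mg/ml (numerically equal to g/l) into mol/l."""
    return mg_per_ml / molar_mass_g_per_mol


print(round(molarity_from_mg_per_ml(85.0, LICL_MOLAR_MASS_G_PER_MOL), 2))
```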

The obtained RNA was dissolved in 10 mM Tris buffer (pH 7.5) to give a concentration of about 2 mg/ml, to
which 5 M potassium acetate was added to give a concentration of 200 mM. Ethanol was added thereto to give a
concentration of 70 %, followed by cooling at -80 °C for 10 minutes. Centrifugation was performed at 4 °C
for 10 minutes at 15,000 rpm, and then the obtained precipitate was suspended in an appropriate amount of
sterilized water to give an RNA sample. As a result of quantitative measurement of the RNA sample, total RNA
was obtained in an amount of 2 mg.

<2> Purification of mRNA 


mRNA was purified as a poly(A)+ RNA fraction from the total RNA obtained as described above. Purification
was performed by using Oligotex-dT30 <Super> (purchased from Toyobo) as oligo(dT)-immobilized latex for
poly(A)+ RNA purification.

Elution buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.1 % SDS) was added to a solution containing 1 mg of
the total RNA to give a total volume of 1 ml, to which 1 ml of Oligotex-dT30 <Super> was added, followed by
heating at 65 °C for 5 minutes and quick cooling on ice for 3 minutes. To the obtained solution, 0.2 ml of
5 M NaCl was added, and the solution was incubated at 37 °C for 10 minutes, followed by centrifugation at
15,000 rpm for 3 minutes. After that, the supernatant was carefully removed.
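
The salt concentration during the binding step follows from the volumes given above. The calculation below is a minimal sketch which assumes that the 1 ml RNA solution and the 1 ml of Oligotex-dT30 <Super> contribute no NaCl of their own, which the text does not state; the function name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume, other_volume):
    """Concentration after adding stock_volume of a stock solution to other_volume of liquid."""
    return stock_conc * stock_volume / (stock_volume + other_volume)


# 0.2 ml of 5 M NaCl added to 1 ml RNA solution + 1 ml Oligotex suspension.
nacl_molar = final_concentration(stock_conc=5.0, stock_volume=0.2, other_volume=2.0)
print(round(nacl_molar, 3))
```

Under this assumption the hybridization takes place at roughly 0.45 M NaCl, close to the 0.5 M NaCl of the Washing Buffer used in the following step.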

The obtained pellet was suspended in 2.5 ml of Washing Buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.5 M
NaCl, 0.1 % SDS), and the suspension was centrifuged at 15,000 rpm for 3 minutes. After that, the
supernatant was carefully removed. The obtained pellet was suspended in 1 ml of TE Buffer, and then it was
heated at 65 °C for 5 minutes. The suspension was quickly cooled on ice for 3 minutes, and then it was
centrifuged at 15,000 rpm for 3 minutes to recover poly(A)+ mRNA contained in the obtained supernatant.

Thus, the poly(A)+ mRNA in an amount of about 10 µg was obtained from 1 mg of the total RNA. An aliquot of
5 µg thereof was used to prepare a cDNA library.

<3> Preparation of cDNA library 

(1) Synthesis of cDNA 


The mRNA obtained as described above was used as a template to synthesize cDNA by using a λZAP cDNA
synthesis kit produced by Stratagene. The following solution was prepared and mixed in a tube.
5.0 µl 10 x 1st Strand Buffer (buffer for reverse transcription reaction);
3.0 µl 10 mM 1st Strand Methyl Nucleotide Mix (5-methyl dCTP, dATP, dGTP, dTTP mixture);
2.0 µl Linker-Primer (linker and primer);
H2O (adjusted to give a total volume of 50 µl);
1.0 µl RNase Block II (RNase inhibitor).

The respective components described above were contents of the kit. Linker-Primer had a sequence as shown in
SEQ ID NO: 13. Methylated nucleotide was used so that the cDNA would not be digested in the restriction
enzyme reaction performed later on. The reaction solution was agitated well, and then 5.0 µg of poly(A)+
mRNA was added thereto, followed by being left to stand at room temperature for 10 minutes. Further, 2.5 µl
of M-MuLV RTase (reverse transcriptase) was added (at this time, the total volume was 50 µl). The reaction
solution was gently mixed, followed by centrifugation under a mild condition to allow the reaction solution
to fall to the bottom of the tube. The reaction was performed at 37 °C for 60 minutes.

Next, the following solution was prepared and mixed in the tube in a certain order.

45.0 µl reaction solution containing cDNA primary chain;
40.0 µl 10 x 2nd Strand Buffer (buffer for polymerase reaction);
6.0 µl 2nd Strand Nucleotide Mixture (A, G, C, T mixture);
302.0 µl H2O.


The following solutions were further added. In order to allow RNase H and DNA polymerase to act
simultaneously, the enzyme solutions were first placed on the wall of the tube. After that, a vortex
treatment was promptly performed, the reaction solution was allowed to fall to the bottom of the tube by
means of centrifugation, and the reaction for synthesizing the cDNA second strand was performed at 16 °C
for 150 minutes.

0.8 µl RNase H (RNA-degrading enzyme);
7.5 µl DNA polymerase I (10.0 u/µl).

To the reaction solution, 400 µl of a mixed solution of phenol: chloroform (1:1) was added. Agitation was
performed well, followed by centrifugation at room temperature for 2 minutes. To the obtained supernatant,
400 µl of phenol: chloroform was added again, and the mixture was subjected to a vortex treatment and
centrifugation at room temperature for 2 minutes. The following solution was added to the obtained
supernatant to precipitate cDNA.

33.3 µl 3 M sodium acetate solution;
867.0 µl 100 % ethanol.

The obtained solution was left to stand at -20 °C overnight, and it was centrifuged at room temperature for
60 minutes. After that, the pellet was gently washed with 80 % ethanol, followed by centrifugation for 2
minutes. The supernatant was removed. The obtained pellet was dried, and it was dissolved in 43.5 µl of
sterilized water. The following solution was added to an aliquot (39.0 µl) to blunt-end the cDNA terminals.

5.0 µl 10 x T4 DNA Polymerase Buffer (buffer for T4 polymerase reaction);
2.5 µl 2.5 mM dNTP Mix (A, G, C, T mixture);
3.5 µl T4 DNA polymerase (2.9 u/µl).

The reaction was performed at 37 °C for 30 minutes, to which 50 µl of distilled water was added, and then
100 µl of phenol: chloroform was added thereto, followed by a vortex treatment and centrifugation for 2
minutes. To the obtained supernatant, 100 µl of chloroform was added, and the mixture was subjected to a
vortex treatment, followed by centrifugation for 2 minutes. The following solution was added to the
supernatant to precipitate cDNA.

7.0 µl 3 M sodium acetate solution;
226 µl 100 % ethanol.
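
The two ethanol precipitation steps described so far can be compared with a short calculation. This is a minimal sketch which assumes that the aqueous phase recovered after extraction has its nominal volume (about 400 µl after second strand synthesis, 100 µl after blunt-ending) and that volumes are additive, neither of which the text states; the function name is hypothetical.

```python
def precipitation_conditions(sample_ul, acetate_ul, ethanol_ul, acetate_stock_molar=3.0):
    """Return (final sodium acetate in mol/l, final ethanol in % v/v) for a precipitation mix."""
    total_ul = sample_ul + acetate_ul + ethanol_ul
    acetate_molar = acetate_stock_molar * acetate_ul / total_ul
    ethanol_percent = 100.0 * ethanol_ul / total_ul
    return acetate_molar, ethanol_percent


# After second strand synthesis: about 400 µl sample, 33.3 µl acetate, 867 µl ethanol.
first = precipitation_conditions(400.0, 33.3, 867.0)
# After blunt-ending: 100 µl sample, 7.0 µl acetate, 226 µl ethanol.
second = precipitation_conditions(100.0, 7.0, 226.0)
for acetate_molar, ethanol_percent in (first, second):
    print(f"{acetate_molar:.3f} M sodium acetate, {ethanol_percent:.1f} % ethanol")
```

Under these assumptions both steps end up at roughly two thirds ethanol by volume, that is, about two volumes of ethanol per volume of salted sample.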

The solution was left to stand on ice for 30 minutes or more, and it was centrifuged at 4 °C for 60 minutes.
The obtained precipitate was washed with 150 µl of 80 % ethanol, followed by centrifugation for 2 minutes
and drying. The cDNA pellet was dissolved in 7.0 µl of EcoRI Adapter solution, to which the following
solution was added to ligate the EcoRI adapter to both ends of the cDNA. Sequences of the respective strands
of the EcoRI adapter are shown in SEQ ID NO: 14 and Fig. 2.
1.0 µl 10 x Ligation Buffer (buffer for ligase reaction);
1.0 µl 10 mM ATP;
1.0 µl T4 DNA ligase.

The reaction solution was centrifuged under a mild condition, and it was left to stand at 4 °C overnight or
longer. The solution was treated at 70 °C for 30 minutes, and then it was centrifuged under a mild
condition, followed by being left to stand at room temperature for 5 minutes. The following solution was
added to the reaction solution to phosphorylate the 5'-terminals of the EcoRI adapter.

1.0 µl 10 x Ligation Buffer (buffer for ligase reaction);
2.0 µl 10 mM ATP;
6.0 µl H2O;
1.0 µl T4 polynucleotide kinase (10.0 u/µl).

The reaction was performed at 37 °C for 30 minutes, followed by a treatment at 70 °C for 30 minutes. The
solution was centrifuged under a mild condition, and it was left to stand at room temperature for 5 minutes.
The following solution was further added thereto to perform a reaction at 37 °C for 90 minutes so that the
XhoI site introduced by Linker-Primer was digested with XhoI, followed by being left to stand at room
temperature to cool.

28.0 µl XhoI Buffer;
3.0 µl XhoI (45 u/µl).

To the reaction solution was added 5.0 µl of 10 x STE (10 mM Tris-HCl (pH 8.0), 100 mM NaCl, 1 mM EDTA), and the
mixture was applied to a centrifuge column for removing short fragments (Sephacryl Spin Column) and centrifuged
at 600 g for 2 minutes to obtain an eluate, which was designated as Fraction 1. This operation was repeated
three more times to obtain Fractions 2, 3, and 4, respectively. Fractions 3 and 4 were combined, phenol:chloroform
(1:1) was added, and the mixture was agitated well, followed by centrifugation at room temperature for 2 minutes.
An equal amount of chloroform was added to the obtained supernatant, and the mixture was agitated well. The mixture
was centrifuged at room temperature for 2 minutes to obtain a supernatant, to which a two-fold amount of 100 % ethanol
was added, followed by standing at -20 °C overnight. The solution was centrifuged at 4 °C for 60 minutes, and the
precipitate was washed with an equal amount of 80 % ethanol. The solution was centrifuged again at 4 °C for 60 minutes
to obtain a cDNA pellet, which was suspended in 10 µl of sterilized water.

(2) Preparation of cDNA library

The double-stranded cDNA obtained as described above was ligated with a λ phage expression vector to prepare a
recombinant vector. The following solution was prepared and mixed in a tube, and the reaction was performed at 12 °C
overnight, followed by standing at room temperature for 2 hours, to ligate the cDNA with the vector.

2.5 µl cDNA solution;
0.5 µl 10 x Ligation Buffer;
0.5 µl 10 mM ATP;
1.0 µl λZAP vector DNA (1 µg/µl);
0.5 µl T4 DNA ligase (4 Weiss u/µl).

(3) Packaging of phage DNA into phage particles 

The phage vector containing the cDNA was packaged into phage particles by using an in vitro packaging kit
(Gigapack II Gold packaging extract, produced by Stratagene). The recombinant phage solution was added to the
Freeze/Thaw extract immediately after thawing, and the solution was placed on ice. Then 15 µl of Sonic extract was
added and mixed well by pipetting. The reaction solution was centrifuged gently and left to stand at room
temperature (22 °C) for 2 hours. To the reaction solution was added 500 µl of Phage Dilution Buffer, and then
20 µl of chloroform, followed by mixing. In order to measure the titer of the library, an aliquot (2 µl) of the
500 µl aqueous phase was diluted in a ratio of 1:10 with 18 µl of SM buffer (5.8 g of NaCl, 2 g of MgSO4·7H2O,
50 ml of 1 M Tris-HCl (pH 7.5), and 5 ml of 2 % gelatin in 1 L). The diluted solution (1 µl) and the phage
stock solution (1 µl) were each plated together with 200 µl of a culture solution of Escherichia coli PLK-F' strain
that had been cultivated to an OD600 of 0.5. That is, the Escherichia coli PLK-F' strain was mixed with the
phage solution and incubated at 37 °C for 15 minutes. The obtained culture was added to 2 to 3 ml of top agar
(48 °C), which was immediately overlaid on an NZY agar plate that had been warmed to 37 °C. Cultivation was performed
overnight at 37 °C, and the plaques that appeared were counted to calculate the titer. As a result, the titer was
1.2 x 10^6 pfu/ml.
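
The titer calculation described above can be sketched as follows. This is a minimal illustration, assuming the usual relation titer = plaques x dilution factor / plated volume. The function name and the plaque count of 120 are hypothetical and are not data from this example; the count is chosen only because it would correspond to the reported titer for 1 µl of a 1:10 dilution.

```python
def phage_titer_pfu_per_ml(plaques, dilution_factor, plated_volume_ul):
    """Return the titer in pfu/ml from a plaque count.

    plaques          -- plaques counted on one plate
    dilution_factor  -- 10 for a 1:10 dilution, 1 for undiluted stock
    plated_volume_ul -- volume of phage solution plated, in microliters
    """
    if plated_volume_ul <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    # 1000 ul per ml converts the plated volume to milliliters.
    return plaques * dilution_factor * 1000.0 / plated_volume_ul


# Hypothetical count: 120 plaques from 1 ul of the 1:10 dilution.
titer = phage_titer_pfu_per_ml(plaques=120, dilution_factor=10, plated_volume_ul=1)
print(f"{titer:.1e} pfu/ml")
```

Plating both the diluted and the undiluted solution, as the text does, gives two independent estimates that should agree when the plaques are countable on both plates.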

(4) Amplification of library 

A centrifuge tube was charged with the packaging solution containing about 50,000 recombinant bacteriophages
and 600 µl of a culture solution of Escherichia coli PLK-F' strain that had been cultivated to an OD600 of
0.5, followed by incubation at 37 °C for 15 minutes. To the culture solution was added 6.5 ml of top agar that had
been melted and kept at 48 °C, and the mixture was overlaid on a 150 mm NZY plate that had been warmed to about 37 °C,
followed by cultivation at 37 °C for 5 to 8 hours. To each plate was added 10 ml of SM Buffer, and the plates were
incubated at 4 °C overnight with gentle shaking. The SM Buffer from each plate was collected in a sterilized
polypropylene tube. Each plate was rinsed with 2 ml of SM Buffer, and the rinse was collected in
the same tube. Chloroform in an amount corresponding to 5 % of the total volume was added and mixed, followed by
standing at room temperature for 15 minutes. Bacterial cells were removed by centrifugation at 4,000 g for
5 minutes. Chloroform in an amount corresponding to 0.3 % of the total volume was added to the obtained supernatant,
which was stored at 4 °C. The titer of the library amplified as described above was measured in the same
manner as described above. As a result, the titer was 2.3 x 10^9 pfu/ml.
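
As an illustration of how a titer translates into a plating volume, the sketch below estimates the volume that contains about 50,000 phages. It assumes that the packaging solution used for amplification had the titer measured in (3), which the text does not state explicitly; the function name is hypothetical.

```python
def volume_for_pfu_ul(target_pfu, titer_pfu_per_ml):
    """Return the volume in microliters that contains target_pfu phages."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return target_pfu / titer_pfu_per_ml * 1000.0


# Assumption: the packaging solution has the titer measured before amplification.
volume_ul = volume_for_pfu_ul(target_pfu=50_000, titer_pfu_per_ml=1.2e6)
print(f"about {volume_ul:.1f} ul per plate")
```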

(5) Excision of plasmid from phage DNA

In vivo excision of the plasmid portion from the recombinant phage DNA was performed. The following solution
was mixed in a 50 ml conical tube to cause infection at 37 °C for 15 minutes:

culture solution of Escherichia coli XL1-Blue (OD600 = 0.1) 200 µl;
phage solution after amplification 200 µl (> 1 x 10^5 phage particles);
helper phage R408 1 µl (> 1 x 10^6 pfu/ml).

To the mixed solution was added 5 ml of 2 x YT medium, and cultivation was performed at 37 °C for 3 hours with shaking.
The culture was heat-treated at 70 °C for 20 minutes, followed by centrifugation at 4,000 g for 5 minutes. The
obtained supernatant was decanted and transferred to a sterilized tube. Centrifugation was performed to obtain a
supernatant, which was diluted 100-fold. An aliquot (20 µl) of the diluted solution was mixed with 200 µl
of a culture solution of Escherichia coli XL1-Blue that had been cultivated to an OD600 of 1.0, to cause
infection at 37 °C for 15 minutes. Aliquots (1 to 100 µl) of the culture solution were plated on LB plates containing
ampicillin, followed by cultivation at 37 °C overnight. Colonies that appeared were randomly selected. Glycerol was
added to the selected colonies, which were stored at -80 °C.

(6) Preparation of plasmid 

Plasmids were prepared by using the Magic Mini-prep kit produced by Promega. The culture fluid of Escherichia coli
harboring the plasmid, which had been stored at -80 °C, was inoculated into 5 ml of 2 x YT medium, followed by
cultivation at 37 °C overnight. Centrifugation was performed for 5 minutes (4,000 rpm, 4 °C), and the supernatant was
removed by decantation. To the obtained bacterial cell pellet was added 1 ml of TE buffer, followed by vortexing. The
obtained bacterial cell suspension was transferred to an Eppendorf tube, followed by centrifugation for 5 minutes
(5,000 rpm, 4 °C). The resultant supernatant was removed by decantation.

To the obtained bacterial cell pellet was added 300 µl of Cell Resuspension Solution, and the pellet was thoroughly
suspended therein. The obtained suspension was transferred to an Eppendorf tube. The suspension was agitated for
2 minutes with a mixer, 300 µl of Cell Lysis Solution was added, and agitation was continued until the suspension
became transparent. Neutralization Solution (300 µl) was added thereto, and the tube was shaken by
hand, followed by centrifugation for 10 minutes (15,000 rpm).

Only the obtained supernatant was transferred to a new Eppendorf tube (1.5 ml). A suction tube was prepared, to
which a stopcock, a miniature column, and a syringe were connected in this order. Resin (1 ml)
was charged into the syringe. The supernatant was poured into the syringe and mixed well, followed
by suction. Column Washing Solution (2 ml) was added, and washing was performed under
suction. Suction was continued for 1 to 2 minutes to dry the column. The miniature column was removed from the
equipment and set in a new Eppendorf tube (1.5 ml). Sterilized water (100 µl) that had been warmed
to 65 to 70 °C was poured into the miniature column, and the column and the Eppendorf tube were centrifuged together
for 1 minute (5,000 rpm). The eluate was transferred to an Eppendorf tube, to which 5 µl of 3 M aqueous sodium acetate
solution and then 250 µl of cold ethanol were added. The solution was centrifuged (15,000 rpm,
25 minutes), and the supernatant was discarded. To the obtained precipitate was added 1 ml of 70 % ethanol, followed
by centrifugation again (15,000 rpm, 3 minutes). The ethanol was completely removed, and the tube was vacuum-dried in
a desiccator. The precipitate was thoroughly dissolved in 20 µl of sterilized water, and the obtained solution was
stored at -20 °C. An aliquot (1 µl) of the solution was taken and subjected to electrophoresis together with volume
markers to quantitatively determine the plasmid DNA.

<4> Determination of nucleotide sequence of cDNA and homology search with gene database

(1) Determination of nucleotide sequence of cDNA 

The nucleotide sequence of the cDNA was analyzed by using DNA automatic sequencer 373A produced by Applied
Biosystems Inc. (ABI). The sequencing reaction was performed in accordance with the attached manual by using T3
primer and the Dye Primer Cycle Sequencing Kit produced by the same company. The nucleotide sequence
was determined for about 750 clones which were randomly selected.

(2) Homology search

The partial sequences of the about 750 clones were searched by computer using BlastX. As a result, three clones
appeared to be homologues of the bacterial cellulose synthase subunit. Isolation of full-length clones was
therefore attempted.
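
BlastX compares the six-frame conceptual translation of a nucleotide query against protein sequences, which is why partial cDNA reads could be matched to a bacterial protein. The sketch below shows only that translation step using the standard genetic code; it does not perform a database search. The function names are hypothetical, and the 15-base input is the legible stretch at the start of the coding region of SEQ ID NO: 1 (encoding Met Glu Ser Gly Val).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1..+3 and -1..-3."""
    frames = {}
    rc = reverse_complement(seq)
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


read = "ATGGAATCTGGGGTT"
for name, protein in six_frame_translations(read).items():
    print(name, protein)
```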

<5> Isolation of full length clones
(1) 5'-RACE

The homology search revealed that the obtained homologue clones were partial-length clones. Therefore,
primers were synthesized for elongation toward the 5' upstream region, and RT-PCR was performed by using
mRNA as a template.

(1-a) Synthesis of first-strand DNA 

The following solution was prepared and mixed in a tube.

0.5 µl 10 µmol gene-specific primer 1;
1 µg total RNA;
DEPC-treated H2O (adjusted to give a total amount of 9 µl).

The following oligonucleotides were used as the gene-specific primer 1: an oligonucleotide having the nucleotide
sequence shown in SEQ ID NO: 15 for PcsA1, an oligonucleotide having the nucleotide sequence shown in
SEQ ID NO: 16 for PcsA2, and an oligonucleotide having the nucleotide sequence shown in SEQ ID NO: 17
for PcsA3.

The reaction solution was gently mixed and then centrifuged gently to bring the reaction
solution to the bottom of the tube. The solution was left to stand at 70 °C for 10 minutes, followed by immediate
cooling on ice.

Next, the following solution was prepared and mixed in the tube. 

5 x RT Buffer 5 µl;
25 mM MgCl2 2.5 µl;
2 mM dNTP mix 5 µl;
0.1 M DTT 2.5 µl;
H2O (added to give a total amount of 24 µl).

The solution was gently agitated and then centrifuged gently to bring the reaction solution
to the bottom of the tube, followed by standing at 42 °C for 1 minute. To the solution was added 1 µl
of SuperScript II RT (reverse transcriptase, GIBCO BRL), followed by gentle mixing. The reaction was then performed
at 42 °C for 50 minutes. Subsequently, the reaction solution was left to stand at 70 °C for 15 minutes to stop the
reaction. Centrifugation was performed gently to bring the reaction solution to the bottom of the
tube, followed by standing at 37 °C. RNase H (produced by Toyobo) in an amount of 1 µl was added thereto
to perform a reaction at 37 °C for 30 minutes.

Subsequently, in order to remove excess primers and nucleotides contained in the reaction solution, gel filtration
was performed by using a purification column, Quick Spin Columns, produced by Boehringer. First, the tip of the
column was removed, followed by centrifugation at 1,100 x g for 2 minutes to discard the buffer. The reaction solution
was applied to the center of the column, followed by centrifugation at 1,100 x g for 4 minutes to recover the
solution.

(1-b) Poly(dC) tailing

An aliquot (5 µl) was taken from the obtained solution, and the following solution was added to it.

5 µl 5 x CoCl2 Buffer;
2.5 µl 2 mM dCTP;
H2O (adjusted to give a total amount of 24 µl).

The reaction solution was mixed well and left to stand at 94 °C for 3 minutes. Centrifugation was performed
gently to bring the reaction solution to the bottom of the tube, followed by standing on
ice. Terminal transferase TdT (produced by Toyobo) was added thereto in an amount of 1 µl, followed by gentle
mixing, and the reaction was performed at 37 °C for 10 minutes. Subsequently, the reaction solution was left to stand
at 65 °C for 10 minutes to stop the reaction.

(1-c) PCR reaction

An aliquot (2.5 µl) was taken from the reaction solution, and the following solution was added to it.

2.5 µl 10 x PCR Buffer;
2.5 µl 2 mM dNTP mix;
0.5 µl Gene-specific primer 2;
0.5 µl Abridged Anchor Primer (GIBCO BRL);
0.5 µl Advantage Klentaq Polymerase Mix (Clontech);
H2O (adjusted to give a total amount of 25 µl).
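
The amount of water implied by "adjusted to give a total amount of 25 µl" can be worked out from the listed volumes. The following is a minimal sketch with hypothetical function names. It assumes that the 2.5 µl aliquot of tailed cDNA counts toward the 25 µl total and that the dNTP stock is the listed 2 mM mix.

```python
def water_to_add_ul(total_ul, component_volumes_ul):
    """Return the water volume needed to bring the mix to total_ul."""
    used = sum(component_volumes_ul.values())
    water = total_ul - used
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


def final_concentration(stock_conc, stock_volume_ul, total_ul):
    """Return the final concentration, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / total_ul


pcr_mix_ul = {
    "tailed cDNA aliquot": 2.5,
    "10 x PCR Buffer": 2.5,
    "2 mM dNTP mix": 2.5,
    "Gene-specific primer 2": 0.5,
    "Abridged Anchor Primer": 0.5,
    "Advantage Klentaq Polymerase Mix": 0.5,
}

water_ul = water_to_add_ul(25.0, pcr_mix_ul)
dntp_mm = final_concentration(2.0, pcr_mix_ul["2 mM dNTP mix"], 25.0)
print(f"water: {water_ul} ul, final dNTP: {dntp_mm} mM")
```

Under these assumptions the buffer is diluted ten-fold (2.5 µl in 25 µl), which is consistent with a 10 x stock.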

The following oligonucleotides were used as Gene-specific primer 2: an oligonucleotide having the nucleotide
sequence shown in SEQ ID NO: 18 for PcsA1, an oligonucleotide having the nucleotide sequence shown in
SEQ ID NO: 19 for PcsA2, and an oligonucleotide having the nucleotide sequence shown in SEQ ID NO: 20
for PcsA3.

The solution was introduced into a 0.2 ml tube to perform the PCR reaction under the following conditions.

PAD          94 °C          90 seconds
30 cycles    94 °C          30 seconds
             60 to 68 °C    30 to 60 seconds
             68 °C          180 seconds
Final        68 °C          7 minutes
Hold         4 °C

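The cycling program above can be written down as data to estimate the programmed run time. This is a sketch only: the class and function names are hypothetical, ramping time between temperatures is ignored, and the final hold at 4 °C is not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: str   # temperature or temperature range, as text
    min_s: int    # shortest programmed duration in seconds
    max_s: int    # longest programmed duration in seconds


INITIAL = Step("94", 90, 90)
CYCLE = [
    Step("94", 30, 30),        # denaturation
    Step("60-68", 30, 60),     # annealing, range given in the text
    Step("68", 180, 180),      # extension
]
N_CYCLES = 30
FINAL = Step("68", 7 * 60, 7 * 60)


def programmed_time_s(use_max):
    """Sum the step durations, taking the upper or lower bound of each range."""
    def pick(step):
        return step.max_s if use_max else step.min_s
    return pick(INITIAL) + N_CYCLES * sum(pick(s) for s in CYCLE) + pick(FINAL)


low, high = programmed_time_s(False), programmed_time_s(True)
print(f"{low / 60:.1f} to {high / 60:.1f} minutes, excluding ramping and hold")
```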

The reaction solution was subjected to agarose gel electrophoresis, and the DNAs corresponding to the largest
bands (about 1.8 kb for PcsA1, about 2 kb for PcsA2, and about 2.2 kb for PcsA3) were extracted from the gel.
GENO-BIND produced by CLONTECH was used for the extraction, and the procedure was carried out in accordance with its
protocol. The DNA thus obtained was subjected to poly(dC) tailing and used as a template to perform the PCR
reaction. The conditions and the composition of the reaction solution were the same as those described above.

(2) Cloning 

(2-a) 5'-RACE TA cloning 

Starting from the obtained PCR reaction solution, cloning was performed by using the TA Cloning Kit produced by
Invitrogen in accordance with its protocol.

The following solution was added to an aliquot (1.5 µl) of the PCR reaction solution obtained as described above.

0.5 µl 10 x Ligation Buffer;
1 µl pCRII vector;
0.5 µl T4 DNA Ligase;
1.5 µl dH2O.

The reaction was performed at 14 °C overnight. An aliquot (2 µl) of the reaction solution was added to 25 µl of an
Escherichia coli competent cell (JM109) preparation, followed by standing on ice for 30 minutes. After that,
heat shock was applied at 42 °C for 30 seconds. The solution was left to stand on ice for 2 minutes, and
450 µl of SOB medium was then added, followed by cultivation at 37 °C for 1 hour with shaking at 200 rpm.
The culture was spread over an Amp/Xgal/IPTG plate, followed by incubation at 37 °C overnight. The plasmid was
extracted from the obtained colonies in accordance with the method described above.

(2-b) Cloning of complete length cDNA 

Sequencing was carried out by using DNA Sequencer 377 produced by ABI in accordance with its protocol.
The sequencing reaction was performed by using M13 primer and synthetic oligomers as primers, with the
Dye Terminator Cycle Sequencing Kit produced by the same company. As a result of the sequencing, it was revealed
for PcsA3 that another clone, which also belonged to the PcsA3 group but had a slightly different sequence
(differing at one amino acid position), had been isolated (see Figs. 3 and 4). The nucleotide sequence of a clone
(PcsA3-682) containing the 3'-side region of PcsA3 and the amino acid sequence deduced from this nucleotide sequence
are shown in SEQ ID NOs: 5 and 6. The nucleotide sequence of the 5'-portion (PcsA3-5') of another clone containing
the 5'-side region of PcsA3 and the amino acid sequence deduced from this nucleotide sequence are shown in
SEQ ID NOs: 7 and 8. The nucleotide sequence of the 3'-portion (PcsA3-3') of that clone and the amino acid sequence
deduced from this nucleotide sequence are shown in SEQ ID NOs: 9 and 10.

As for PcsA1 and PcsA2, primers for the 5'-terminal and the 3'-terminal of a region containing the ORF were
synthesized on the basis of the obtained sequences to perform the PCR reaction. Complete length clones were thus
isolated by means of TA cloning. The conditions and the composition of the reaction solution were the same as those
described above.

Oligonucleotides shown in SEQ ID NO: 21 (5'-terminal) and SEQ ID NO: 22 (3'-terminal) were used as the primers
for PcsA1. Oligonucleotides shown in SEQ ID NO: 23 (5'-terminal) and SEQ ID NO: 24 (3'-terminal) were used as the
primers for PcsA2. Results are shown in SEQ ID NOs: 1 to 4.
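
The figures given for SEQ ID NOs: 1 and 2 in the sequence listing can be cross-checked with simple coordinate arithmetic. The sketch below assumes that the annotated CDS includes the stop codon and takes as inputs the CDS start at position 77, the 974 residues of SEQ ID NO: 2, and the 3207 base pairs of SEQ ID NO: 1; the function name is hypothetical.

```python
def cds_end(start, n_residues, include_stop=True):
    """Return the 1-based end coordinate of a CDS.

    start        -- 1-based position of the first base of the start codon
    n_residues   -- number of amino acid residues encoded
    include_stop -- whether the stop codon is counted in the CDS
    """
    codons = n_residues + (1 if include_stop else 0)
    return start + 3 * codons - 1


end = cds_end(start=77, n_residues=974)
utr5_len = 77 - 1          # bases upstream of the start codon
utr3_len = 3207 - end      # bases downstream of the CDS, including poly(A)
print(end, utr5_len, utr3_len)
```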

Annex to the description

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: NISSHINBO INDUSTRIES, INC.

HAYASHI, Takahisa
(ii) TITLE OF INVENTION: CELLULOSE SYNTHASE GENE
(iii) NUMBER OF SEQUENCES: 24
(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: 

(B) STREET: 

(C) CITY: 

(E) COUNTRY:

(F) ZIP: 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 9-83133 

(B) FILING DATE: 1-APR-1997
(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: 

(B) REGISTRATION NUMBER: 
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 

(B) TELEFAX: 

(2) INFORMATION FOR SEQ ID NO: 1: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3207 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Gossypium hirsutum L. 
(C) INDIVIDUAL ISOLATE: Coker312 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 77..3001

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGTTAGCATA TTCTTTGTAG CATTGGCTIT TTTTCICAAG GAAGAAGAAG GAGAAAGATA 60 
ACTAATCTTT TTGAGA ATC ATG GAA TCT GGG GTT OCT GIT TQC CAC ACT 109 
Met Met Glu Ser Gly Val Pro Val Cys His Thr 
1 5 10
TCT GCT GAA CAT GTT GGG TTG AAT GIT AAT GCT GAA 00C TTT GTG GCT 157 
Cys Gly Glu His Val Gly Leu Asn Val Asn Gly Glu Pro Phe Val Ala

15 20 25 

TOC CAT GAA TCT AAT TIC OCT ATT TCT AAG ACT TCT TTT GAG TAT GAT 205 
Cys His Glu Cys Asn Phe Pro Ile Cys Lys Ser Cys Phe Glu Tyr Asp

30 35 40 

CTT AAG GAA GGA CAA AAA GCT TQC TTG OCT TCT GCT ATT O0G TAT GAT 253 
Leu Lys Glu Gly Gln Lys Ala Cys Leu Arg Cys Gly Ile Pro Tyr Asp

45 50 55 

GAA AAC CFG TTG GAC GAT GTC GAG AAG GOC AOC GGC GAT CAA TOG ACA 301 
Glu Asn Leu Leu Asp Asp Val Glu Lys Ala Thr Gly Asp Gln Ser Thr

60 65 70 75 

ATG GCT GCA CAT TTG AGC AAG TCT CAG GAT GTT GGA ATT CAT GCA AGA 349 
Met Ala Ala His Leu Ser Lys Ser Gln Asp Val Gly Ile His Ala Arg

80 85 90 

CAT ATC AGC ACT GTC TCT ACA TTG GAT ACT GAA ATG ACT GAA GAC AAT 397 
His Ile Ser Ser Val Ser Thr Leu Asp Ser Glu Met Thr Glu Asp Asn

95 100 105 

GGG AAT COG ATT TOG AAG AAC AQG GTG GAA ACT TOG AAA GAA AAG AAG 445 
Gly Asn Pro Ile Trp Lys Asn Arg Val Glu Ser Trp Lys Glu Lys Lys

110 115 120 

AAC AAG AAG AAG AAG OCT GCA ACA ACT AAG GTT GAA AGA GAG GCT GAA 493 
Asn Lys Lys Lys Lys Pro Ala Thr Thr Lys Val Glu Arg Glu Ala Glu 

125 130 135 

ATC OCA OCT GAG CAA CAA ATC GAA GAT AAA COG GCA COG GAT GCT TOC 541 
Ile Pro Pro Glu Gln Gln Met Glu Asp Lys Pro Ala Pro Asp Ala Ser
140 145 150 155 

CAG C0C CTC TOG ACT ATA ATT OCA ATC 00G AAA AGC AGA CTT GCA OCA 589 
Gln Pro Leu Ser Thr Ile Ile Pro Ile Pro Lys Ser Arg Leu Ala Pro

160 165 170 

TAC OGA AOC GTG ATC ATT ATC OGA TTG ATC ATT CTC GCT CTT TIC TTC 637 
Tyr Arg Thr Val Ile Ile Met Arg Leu Ile Ile Leu Gly Leu Phe Phe
175 180 185

CAT TAT OGA CTA ACA AAC OOC GTT GAC ACT GCT TTT GGA CTG TCX5 CTC 685 
His Tyr Arg Val Thr Asn Pro Val Asp Ser Ala Phe Gly Leu Trp Leu 
190 195 200

ACT TCA CTC ATA TGT GAA ATC TGG TTT GCT TTT TOC TGG OTG TTG GAT 733 

Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Val Leu Asp
205 210 215 

CAG TTC OCT AAG TGG TAT OCT GTT AAC AGG GAA ACA TAC ATT GAC AGA 781

Gln Phe Pro Lys Trp Tyr Pro Val Asn Arg Glu Thr Tyr Ile Asp Arg
220 225 230 235 

CTG TCT GCA AGA TAT GAA AGA GAA GGT GAA OCT AAT GAA CTT GCT GCA 829 
Leu Ser Ala Arg Tyr Glu Arg Glu Gly Glu Pro Asn Glu Leu Ala Ala

240 245 250 

GTT GAC TTC TTT CTG ACT ACA CTG GAT OCA TTG AAA GAG OCT OCA TTG 877 
Val Asp Phe Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu
255 260 265

ATT ACT G0C AAT ACT CTG CTT TOC ATC CTT QX TTG GAC TAC COG GTA 925 
Ile Thr Ala Asn Thr Val Leu Ser Ile Leu Ala Leu Asp Tyr Pro Val
270 275 280 

GAT AAG CTC TCT TGT TAT ATA TCT GAT GAT GCT GOG G0C ATG CTG ACA 973

Asp Lys Val Ser Cys Tyr Ile Ser Asp Asp Gly Ala Ala Met Leu Thr

285 290 295 

TTT GAA TCT CTA GTA GAA ACA G0C GAC TTT GCA AGA AAG TOG GTT OCA 1021 
Phe Glu Ser Leu Val Glu Thr Ala Asp Phe Ala Arg Lys Trp Val Pro 
300 305 310 315 

TTC TGC AAA AAA TTT TOC ATT GAA CCA 0GG GCA OCT GAG TTT TAC TTC 1069 
Phe Cys Lys Lys Phe Ser Ile Glu Pro Arg Ala Pro Glu Phe Tyr Phe

320 325 330 

TCA CAG AAG ATT GAT TAC TTG AAA GAT AAA GTG CAG OOC TCT TTT CTA 1117 
Ser Gln Lys Ile Asp Tyr Leu Lys Asp Lys Val Gln Pro Ser Phe Val

335 340 345 

AAA GAA OCT AGA GCT ATG AAA AGA GAT TAC GAA GAG TAC AAA ATT CGA 1165 
Lys Glu Arg Arg Ala Met Lys Arg Asp Tyr Glu Glu Tyr Lys Ile Arg

350 355 360 

ATC AAT GCT TTA GTT GCA AAG GCT CAG AAA ACA OCT GAA GAA GGA TGG 1213 
Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Thr Pro Glu Glu Gly Trp

365 370 375 

ACA ATG CAA GAT GGA ACT OCT TGG 00G GGA AAT AAC OCG OCT GAT CAC 1261 
Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn Pro Arg Asp His
380 385 390 395

OCT GGC ATG ATT CAG CTT TTC CTT GGA TAT AGC GCT GCT CAT GAC ATC 1309 
Pro Gly Met Ile Gln Val Phe Leu Gly Tyr Ser Gly Ala His Asp Ile

400 405 410 

GAA GGA AAT GAA CTT OOC OGA CTG GTT TAC GTC TCT AGA GAG AAG AGA 1357 
Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg 

415 420 425 

OCT GGC TAC CAA CAC CAC AAA AAG OCT OCT OCT GAA AAT OCT TTG GTT 1405 
Pro Gly Tyr Gln His His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val

430 435 440 

AGG CTG TCT GCA GTT CTT ACA AAT OCT COC TTC ATC CTC AAT CTT GAT 1453 
Arg Val Ser Ala Val Leu Thr Asn Ala Pro Phe Ile Leu Asn Leu Asp

445 450 455 

TCT GAC CAC TAT CTT AAC AAT AGC AAG GCA GTT AGG GAG GCA ATG TGC 1501 
Cys Asp His Tyr Val Asn Asn Ser Lys Ala Val Arg Glu Ala Met Cys
460 465 470 475 

TTC TTG ATG GAC CCA CAA GTC GGT OGA GAT GTC TGC TAT GTG CAG TTT 1549 
Phe Leu Met Asp Pro Gln Val Gly Arg Asp Val Cys Tyr Val Gln Phe

480 485 490 

OCT CAA AGA TTT GAT GGC ATA GAT AGG ACT GAT CGA TAT GCC AAT OGG 1597 
Pro Gln Arg Phe Asp Gly Ile Asp Arg Ser Asp Arg Tyr Ala Asn Arg

495 500 505 

AAC ACA CTT TTC TTT GAT GTT AAC ATG AAA GGT CTT GAT GGA ATC CAA 1645 
Asn Thr Val Phe Phe Asp Val Asn Met Lys Gly Leu Asp Gly Ile Gln

510 515 520 

GGG OCT GTT TAT GTG GGA ACA GGT TCT GTT TTC AAT AGG CAA GCA CTT 1693 
Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu

525 530 535 

TAT GGC TAT GGT CCA OCT TCA ATG OCA ACT TTT OOC AAG TCA TOC TOC 1741 
Tyr Gly Tyr Gly Pro Pro Ser Met Pro Ser Phe Pro Lys Ser Ser Ser 
540 545 550 555 

TCA TCT TGC TOG TCT TGC TGC COC GGC AAG AAG GAA OCT AAA GAT OCA 1789 
Ser Ser Cys Ser Cys Cys Cys Pro Gly Lys Lys Glu Pro Lys Asp Pro
560 565 570

TCA GAG CTT TAT AGG GAT GCA AAA OGG GAA GAA CTT GAT GCT GCC ATC 1837 
Ser Glu Leu Tyr Arg Asp Ala Lys Arg Glu Glu Leu Asp Ala Ala Ile
575 580 585 

TTT AAC CTT AGG GAA ATT GAC AAT TAT GAT GAG TAT GAA AGA TCA ATG 1885

Phe Asn Leu Arg Glu Ile Asp Asn Tyr Asp Glu Tyr Glu Arg Ser Met

590 595 600 

TTG ATC TCT CAA ACA AGC TTT GAG AAA ACT TTT GGC TEA TCT TCA GTC 1933 
Leu Ile Ser Gln Thr Ser Phe Glu Lys Thr Phe Gly Leu Ser Ser Val
605 610 615 

TTC ATT GAA TCT ACA CTA ATG GAG AAT GGA GGA GTG GCT C3\A TCT GOC 1981
Phe Ile Glu Ser Thr Leu Met Glu Asn Gly Gly Val Ala Glu Ser Ala
620 625 630 635 

AAC OCT TOC ACA CTA ATC AAG GAA GCA ATT CAT GTC ATC GGC TCT GGC 2029 
Asn Pro Ser Thr Leu Ile Lys Glu Ala Ile His Val Ile Gly Cys Gly

640 645 650 

TAT GAG GAG AAG ACT GCA TGG QGG AAA GAG ATT GGA TGG ATA TAT GGT 2077 
Tyr Glu Glu Lys Thr Ala Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly

655 660 665 

TCA CTC ACT GAG GAT ATC TTA ACC GGC TTC AAA ATG CAC TOC CGA GGA 2125 
Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly

670 675 680 

TOG AGA TOG ATT TAC TOC ATG COC TTA AGG CCA GCA TTC AAA GGA TCT 2173 
Trp Arg Ser Ile Tyr Cys Met Pro Leu Arg Pro Ala Phe Lys Gly Ser
685 690 695 

GCA COC ATC AAT CTC TCT GAT OGG TTG CAC CAG GTT CTT CGA TGG GCT 2221

Ala Pro Ile Asn Leu Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala
700 705 710 715 

CTT GGA TCT GTT GAA ATT TTC CTA AGC AGG CAT TOC OCT CTA TGG TAT 2269 
Leu Gly Ser Val Glu Ile Phe Leu Ser Arg His Cys Pro Leu Trp Tyr

720 725 730 

GGC TTT GGA GCT GGT OCT CTT AAA TGG CTT CAA AGA CTA GCA TAT ATA 2317 
Gly Phe Gly Gly Gly Arg Leu Lys Trp Leu Gln Arg Leu Ala Tyr Ile
735 740 745

AAC AOC ATT GTC TAT OCT TTC ACA TOC CTT OCA CTC ATT GOC TAT TCT 2365 
Asn Thr Ile Val Tyr Pro Phe Thr Ser Leu Pro Leu Ile Ala Tyr Cys

750 755 760 

TCA CTA CCA GCA ATC TCT CTT CTC ACA GGA AAA TTT ATC ATA OCA AGG 2413 
Ser Leu Pro Ala Ile Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Thr

765 770 775 

CTC TCA AAC CTG GCA ACT CTT CTC TTT CTT GGC CTT TTC CTT TOC ATT 2461 
Leu Ser Asn Leu Ala Ser Val Leu Phe Leu Gly Leu Phe Leu Ser Ile
780 785 790 795 

ATC GTG ACT GCT GTT CTC GAG CTC CGA TOG ACT GCT GTC AGC ATT GAG 2509 
Ile Val Thr Ala Val Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Glu

800 805 810 

GAC TTA TOG OCT AAC GAG CAG TTT TGG GTC ATC GCT GGC GTT TCA GOC 2557 
Asp Leu Trp Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Val Ser Ala

815 820 825 

CAT CTC TTT GOC GTC TTC CAA GCT TTC CTT AAG ATG CTT GOG GGC ATT 2605 
His Leu Phe Ala Val Phe Gln Gly Phe Leu Lys Met Leu Ala Gly Ile
830 835 840

GftC AGC AAC ITT ACT CTC ACT GOC AAA OCA OCT GAT GAT OCA OWP TTT 2653 
Asp Thr Asn Phe Thr Val Thr Ala Lys Ala Ala Asp Asp Ala Asp Phe

845 850 855 

OCT GAG CTC TAC ATT OTG AAA TOG ACT ACA CTT CTA ATC OCT OCA ACA 2701 
Gly Glu Leu Tyr Ile Val Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr
860 865 870 875 

ACA CTC CTC ATC CTC AAC ATC CTT OCT CTC CTT GOC GGA TIC TCC GAT 2749 
Thr Leu Leu Ile Val Asn Met Val Gly Val Val Ala Gly Phe Ser Asp

880 885 890 

GOC CTC AAC AAA GOG TAC GAA GCT TOG GGA CCA CTC TTT GOC AAA GTG 2797 
Ala Leu Asn Lys Gly Tyr Glu Ala Trp Gly Pro Leu Phe Gly Lys Val

895 900 905 

TTC TTT TCC TTC TOG GTC ATC CTC CAT CTT TAT CCA TTC CTC AAA GCT 2845 
Phe Phe Ser Phe Trp Val Ile Leu His Leu Tyr Pro Phe Leu Lys Gly
910 915 920

CTT AIG GGA OGC CAA AAC AGG ACA OCA AOC ATT GTT GTC CTT TOG TCA 2893 
Leu Met Gly Arg Gln Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser
925 930 935 

GTG TTC TTC GCT TOT GTC TTC TCT CTT GTT TOG GTT COG ATC AAC COG 2941

Val Leu Leu Ala Ser Val Phe Ser Leu Val Trp Val Arg Ile Asn Pro
940 945 950 955 

TTT CTC AGC ACC GOC GAT AGC AOC AOC GTG TCA GAG AGC TOC ATT TCC 2989 
Phe Val Ser Thr Ala Asp Ser Thr Thr Val Ser Gln Ser Cys Ile Ser

960 965 970 

ATT GAT TCT TCATGATATT ATCTCTITCT TAGAATPGAA ATCATTOCAA 3038 
Ile Asp Cys

GTAACTGGAC TCAAACATGT CTATTGACTA AG1TTTGAAC AGTTTGTACC CATTTTATTC 3098

TTAGCAGTCT CTAA3TTT0C TAAACAATGC TATGAACTAT ACATATTTCA TTGATATTTA 3158 
CATTAAATCA AACTACATCA GTCT9CAGAA AAAAAAAAAA AAAAAAAAA 3207 

(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 974 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Met Glu Ser Gly Val Pro Val Cys His Thr Cys Gly Glu His Val
1 5 10 15

Gly Leu Asn Val Asn Gly Glu Pro Phe Val Ala Cys His Glu Cys Asn
20 25 30

Phe Pro Ile Cys Lys Ser Cys Phe Glu Tyr Asp Leu Lys Glu Gly Gln
35 40 45

Lys Ala Cys Leu Arg Cys Gly Ile Pro Tyr Asp Glu Asn Leu Leu Asp
50 55 60

Asp Val Glu Lys Ala Thr Gly Asp Gln Ser Thr Met Ala Ala His Leu
65 70 75 80

Ser Lys Ser Gln Asp Val Gly Ile His Ala Arg His Ile Ser Ser Val
85 90 95

Ser Thr Leu Asp Ser Glu Met Thr Glu Asp Asn Gly Asn Pro Ile Trp
100 105 110

Lys Asn Arg Val Glu Ser Trp Lys Glu Lys Lys Asn Lys Lys Lys Lys
115 120 125

Pro Ala Thr Thr Lys Val Glu Arg Glu Ala Glu Ile Pro Pro Glu Gln
130 135 140

Gln Met Glu Asp Lys Pro Ala Pro Asp Ala Ser Gln Pro Leu Ser Thr
145 150 155 160

Ile Ile Pro Ile Pro Lys Ser Arg Leu Ala Pro Tyr Arg Thr Val Ile
165 170 175

Ile Met Arg Leu Ile Ile Leu Gly Leu Phe Phe His Tyr Arg Val Thr
180 185 190

Asn Pro Val Asp Ser Ala Phe Gly Leu Trp Leu Thr Ser Val Ile Cys
195 200 205

Glu Ile Trp Phe Ala Phe Ser Trp Val Leu Asp Gln Phe Pro Lys Trp
210 215 220

Tyr Pro Val Asn Arg Glu Thr Tyr Ile Asp Arg Leu Ser Ala Arg Tyr
225 230 235 240

Glu Arg Glu Gly Glu Pro Asn Glu Leu Ala Ala Val Asp Phe Phe Val
245 250 255

Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu Ile Thr Ala Asn Thr
260 265 270

Val Leu Ser Ile Leu Ala Leu Asp Tyr Pro Val Asp Lys Val Ser Cys
275 280 285

Tyr Ile Ser Asp Asp Gly Ala Ala Met Leu Thr Phe Glu Ser Leu Val
290 295 300

Glu Thr Ala Asp Phe Ala Arg Lys Trp Val Pro Phe Cys Lys Lys Phe
305 310 315 320

Ser Ile Glu Pro Arg Ala Pro Glu Phe Tyr Phe Ser Gln Lys Ile Asp
325 330 335

Tyr Leu Lys Asp Lys Val Gln Pro Ser Phe Val Lys Glu Arg Arg Ala
340 345 350






EP 0 875 575 A2 



Met Lys Arg Asp Tyr Glu Glu Tyr Lys Ile Arg Ile Asn Ala Leu Val
355 360 365

Ala Lys Ala Gln Lys Thr Pro Glu Glu Gly Trp Thr Met Gln Asp Gly
370 375 380

Thr Pro Trp Pro Gly Asn Asn Pro Arg Asp His Pro Gly Met Ile Gln
385 390 395 400

Val Phe Leu Gly Tyr Ser Gly Ala His Asp Ile Glu Gly Asn Glu Leu
405 410 415

Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr Gln His
420 425 430

His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val Arg Val Ser Ala Val
435 440 445

Leu Thr Asn Ala Pro Phe Ile Leu Asn Leu Asp Cys Asp His Tyr Val
450 455 460

Asn Asn Ser Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro
465 470 475 480

Gln Val Gly Arg Asp Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp
485 490 495

Gly Ile Asp Arg Ser Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe
500 505 510

Asp Val Asn Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val
515 520 525

Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr Gly Pro
530 535 540

Pro Ser Met Pro Ser Phe Pro Lys Ser Ser Ser Ser Ser Cys Ser Cys
545 550 555 560

Cys Cys Pro Gly Lys Lys Glu Pro Lys Asp Pro Ser Glu Leu Tyr Arg
565 570 575

Asp Ala Lys Arg Glu Glu Leu Asp Ala Ala Ile Phe Asn Leu Arg Glu
580 585 590

Ile Asp Asn Tyr Asp Glu Tyr Glu Arg Ser Met Leu Ile Ser Gln Thr
595 600 605

Ser Phe Glu Lys Thr Phe Gly Leu Ser Ser Val Phe Ile Glu Ser Thr
610 615 620

Leu Met Glu Asn Gly Gly Val Ala Glu Ser Ala Asn Pro Ser Thr Leu
625 630 635 640

Ile Lys Glu Ala Ile His Val Ile Gly Cys Gly Tyr Glu Glu Lys Thr
645 650 655

Ala Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp
660 665 670

Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr
675 680 685

Cys Met Pro Leu Arg Pro Ala Phe Lys Gly Ser Ala Pro He Asn Leu 

690 695 700 

Ser Asp Arg Leu His Gin Val Leu Arg Tip Ala Leu Gly Ser Val Glu 
705 710 715 720 

lie Phe Leu Ser Arg His Cys Pro Leu Trp Tyr Gly Phe Gly Gly Gly 

725 730 735 

Arg Leu Lys Trp Leu Gin Arg Leu Ala Tyr He Asn Thr He Val Tyr 

740 745 750 

Pro Phe Thr Ser Leu Pro Leu He Ala Tyr Cys Ser Leu Pro Ala He 

755 760 765 

Cys Leu Leu Thr Gly Lys Phe He He Pro Thr Leu Ser Asn Leu Ala 

770 775 780 

Ser Val Leu Phe Leu Gly Leu Phe Leu Ser He He Val Thr Ala Val 
785 790 795 BOO 

Leu Glu Leu Arg Trp Ser Gly Val Ser He Glu Asp Leu Trp Arg Asn 

BOS 810 815 

Glu Gin Phe Trp Val He Gly Gly Val Ser Ala His Leu Phe Ala Val 
25 820 825 830 

Phe Gin Gly Phe Leu Lys Met Lai Ala Gly He Asp Thr Asn Phe Thr 

835 840 845 

Val Thr Ala Lys Ala Ala Asp Asp Ala Asp Phe Gly Glu Leu Tyr He 

850 855 860 

Val Lys Trp Thr Thr Leu Leu He Pro Pro Thr Thr Leu Leu He Val 
865 870 875 880 

Asn Met Val Gly Val Val Ala Gly Phe Ser Asp Ala Leu Asn Lys Gly 

885 890 895 

Tyr Glu Ala Trp Gly Pro Leu Phe Gly Lys Val Phe Phe Ser Phe Trp 
900 905 910 

40 Val He Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gin 

915 920 925 

Asn Arg Thr Pro Thr lie Val Val Leu Trp Ser Val Leu Leu Ala Ser 

930 935 940 

Val Phe Ser Leu Val Trp Val Arg He Asn Pro Phe Val Ser Thr Ala 
945 950 955 960 

Asp Ser Thr thr Val Ser Gin Ser Cys He Ser He Asp Cys 
965 970 
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3311 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Gossypium hirsutum L.
(C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 23..3142

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CTTTOCTTCT TTTGGTTrrG OC ATG OCT TCA ACC ACC ATG GOC OCT GGC TTT 52 

Met Ala Ser Thr Thr Met Ala Ala Gly Phe 
1 5 10 

GGT TCA CTT GCT CTT GAC GAG AAT COG GGA TCA TOG ACA CAT CAA TCA 100 
Gly Ser Leu Ala Val Asp Glu Asn Arg Gly Ser Ser Thr His Gln Ser

15 20 25 

TCA AOG AAA ATA TQC AGG GTG TGT GOG GAT AAG ATC GGG CAA AAG GAA 148 
Ser Thr Lys Ile Cys Arg Val Cys Gly Asp Lys Ile Gly Gln Lys Glu

30 35 40 

AAC GGA CAA COG TTC GTG OCT TGT CAT GTC TOT GCT TTC COG CTT TGC 196 
Asn Gly Gln Pro Phe Val Ala Cys His Val Cys Ala Phe Pro Val Cys

45 50 55 

OCT OCT TGT TAT GAA TAT GAA AGG ACT GAA GGA AAC CAG TGC TGT OCT 244 
Arg Pro Cys Tyr Glu Tyr Glu Arg Ser Glu Gly Asn Gln Cys Cys Pro

60 65 70 

CAG TGC AAT ACT CGC TAT AAG OCT CAC AAA GCT ACT OCA AGA ATT TCA 292 
Gln Cys Asn Thr Arg Tyr Lys Arg His Lys Gly Ser Pro Arg Ile Ser

75 80 85 90 

GGA GAT GAA GAA GAT GAT TCA GAT CAA GAT GAT TTT GAT GAT GAA TTT 340 
Gly Asp Glu Glu Asp Asp Ser Asp Gln Asp Asp Phe Asp Asp Glu Phe

95 100 105 

CAG ATT AAG AAC CGC AAG GAT GAC TOC CAT OCA CAA CAT GAA AAT GAG 388 
Gln Ile Lys Asn Arg Lys Asp Asp Ser His Pro Gln His Glu Asn Glu

110 115 120 

GAA TAT AAT AAT AAT AAT CAT CAA TOG CAT OCC AAT GGT CAA GCT TTC 436 
Glu Tyr Asn Asn Asn Asn His Gln Trp His Pro Asn Gly Gln Ala Phe

125 130 135 

TCA GTT GOC GGA AGC AOG GOG GGG AAG GAT TTC GAA GGG GAT AAA GAG 484 
Ser Val Ala Gly Ser Thr Ala Gly Lys Asp Leu Glu Gly Asp Lys Glu 
140 145 150 
ATT TAC GGA AGC GAA GAA TGG AAA GAA AGA GTT GAG AAA TGG AAA GTC 532

Ile Tyr Gly Ser Glu Glu Trp Lys Glu Arg Val Glu Lys Trp Lys Val

155 160 165 170 

AGG CAA GAA AAA AGA GGT TTG GTA AGC AAC GAT AAT GGC GGA AAT GAT 580 

Arg Gln Glu Lys Arg Gly Leu Val Ser Asn Asp Asn Gly Gly Asn Asp

175 180 185 

OCT OCT GAA GAA GAT GAT TAT CTC TTG (XT GAA GCT OCX: CAG OCT CTA 628 
Pro Pro Glu Glu Asp Asp Tyr Leu Leu Ala Glu Ala Arg Gln Pro Leu

190 195 200 

TQ& CGA AAA GTG OCA ATT TOG TCA ACT CTG ATA AGC OCT TAC OGG ATA 676 
Trp Arg Lys Val Pro Ile Ser Ser Ser Leu Ile Ser Pro Tyr Arg Ile

205 210 215 

GTC ATC GTC CTC CGA TTC TTC ATC CTC GCA TTT TTC CTC CGG TTC CGT 724 
Val Ile Val Leu Arg Phe Phe Ile Leu Ala Phe Phe Leu Arg Phe Arg

220 225 230 

ATT CTA ACA CCC GOC TAC GAC GCT TAC COG TTA TGG CTA ATC TCT CTC 772 
Ile Leu Thr Pro Ala Tyr Asp Ala Tyr Pro Leu Trp Leu Ile Ser Val
235 240 245 250 

ATC TGC GAA GTT TGG TTC GOC TTC TCC TGG ATT CTC GAT CAG TTC CCT 820 
Ile Cys Glu Val Trp Phe Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro

255 260 265 

AAA TGG TTC OCT ATT ACT OGC GAA ACT TAC CTC GAT CGC CTC TCC TTG 868 
Lys Trp Phe Pro Ile Thr Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu

270 275 280 

AGG TTC GAA OCT GAA GGA GAG CCC AAT CAA CTT GGC CCC CTC GAC GTC 916 
Arg Phe Glu Arg Glu Gly Glu Pro Asn Gln Leu Gly Pro Val Asp Val

285 290 295 

TTC GTC ACT A0C CTT GAC CTT CTC AAG GAA C0C CCC ATC ATA ADC GOC 964 
Phe Val Ser Thr Val Asp Leu Leu Lys Glu Pro Pro Ile Ile Thr Ala

300 305 310 

AAC GOG GTT CTA TOG ATC TTG GOC GTC GAT TAC COG GTC GAG AAA GTG 1012 
Asn Ala Val Leu Ser Ile Leu Ala Val Asp Tyr Pro Val Glu Lys Val
315 320 325 330 

TCT TCT TAT GTG TOG GAC GAT GGT GCT TCC ATG CTT CTT TTC GAT TOG 1060 
Cys Cys Tyr Val Ser Asp Asp Gly Ala Ser Met Leu Leu Phe Asp Ser 

335 340 345 

TTG TCT GAA AGG GCT GAG TTC GOG AGG AGA TGG GTT COG TTT TCT AAG 1108 
Leu Ser Glu Thr Ala Glu Phe Ala Arg Arg Trp Val Pro Phe Cys Lys 

350 355 360 

AAG CAT AAT GTT (3*G CCC AGG GOG 00G GAG TTT TAT TTC AAT GAG AAG 1156 
Lys His Asn Val Glu Pro Arg Ala Pro Glu Phe Tyr Phe Asn Glu Lys 
365 370 375

ATT GAT TAT TTG AAG GftC AAG CTC CAT OCT AGC TTT GTT AAA GAA CGG 1204 
Ile Asp Tyr Leu Lys Asp Lys Val His Pro Ser Phe Val Lys Glu Arg

380 385 390 

AGA GOC ATG AAA AGG GAA TAT GAA GAA TTT AAA GTA AGG ATC AAT GCA 1252 
Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala
395 400 405 410 

TTA GTA GCA AAA GCT CAG AAG AAA OCA GAA GAA GGA TOG GTC ATG CAA 1300 
Leu Val Ala Lys Ala Gln Lys Lys Pro Glu Glu Gly Trp Val Met Gln

415 420 425 

GAT GGC AOC CCA TOG C0C GGA AAT AAC ACT OCT GAT CAT OCT GGA ATC 1348 
Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp His Pro Gly Met 

430 435 440 

ATT CAG GTC TAT CTA GGA ACT GOC GCT GCA CTC GAT GTG GAT GGC AAA 1396 
Ile Gln Val Tyr Leu Gly Ser Ala Gly Ala Leu Asp Val Asp Gly Lys

445 450 455 

GAG CTG OCT OGA CTT GTC TAT GTT TCT OCT GAG AAA OGA OCT GCT TAT 1444 
Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr

460 465 470 

CAG CAC CAT AAG AAA GOC GCT GCT GAG AAT GCT CTG CTT OGA GTT TCP 1492 
Gln His His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val Arg Val Ser
475 480 485 490 

GCA GTG CTT ACT AAT GCA CCC TTC ATA TTG AAT CTG GAT TCT GAT CAT 1540 
Ala Val Leu Thr Asn Ala Pro Phe Ile Leu Asn Leu Asp Cys Asp His

495 500 505 

TAC ATC AAC AAT AGC AAG GOC ATG AGG GAA GOG ATG TOC TTT TTA ATC 1588 
Tyr Ile Asn Asn Ser Lys Ala Met Arg Glu Ala Met Cys Phe Leu Met
510 515 520

GAT OCT CAG TTT GGA AAG AAG CTT TCT TAT GTT CAA TTT OCA CAG AGA 1636 
Asp Pro Gln Phe Gly Lys Lys Leu Cys Tyr Val Gln Phe Pro Gln Arg
525 530 535 

TTT GAT GCT ATT GAT OCT CAT OVT OGA TAT GCT AAT OGA AAT CTT CTC 1684

Phe Asp Gly Ile Asp Arg His Asp Arg Tyr Ala Asn Arg Asn Val Val

540 545 550 

TTC TTT GAT ATC AAC ATG TTG GGA TTA GAT GGA CTT CAA GGC OCT CTA 1732 
Phe Phe Asp Ile Asn Met Leu Gly Leu Asp Gly Leu Gln Gly Pro Val
555 560 565 570 

TAT CTA GGC ACA GGG TCT CTT TTC AAC AGG CAG GCA TTG TAT GGC TAC 1780 
Tyr Val Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr

575 580 585 

GAT CCA OCA CTC TCT GAG AAA OGA OCA AAG ATG ACA TCT GAT TOC TOG 1828 
Asp Pro Pro Val Ser Glu Lys Arg Pro Lys Met Thr Cys Asp Cys Trp

590 595 600 

OCT TCT TOG TOT TGC TCT TGT TGC GGA GGT TCT AGG AAG AAA TCA AAG 1876 
Pro Ser Trp Cys Cys Cys Cys Cys Gly Gly Ser Arg Lys Lys Ser Lys 

605 610 615 

AAG AAA GGT GAA AAG AAG GGC TTA CTC GGA GGT CTT TTA TAC GGA AAA 1924 
Lys Lys Gly Glu Lys Lys Gly Leu Leu Gly Gly Leu Leu Tyr Gly Lys 

620 625 630 

AAG AAG AAG ATG ATG GGC AAA AAC TAT GTG AAA AAA GGG TCT GCA OCA 1972 
Lys Lys Lys Met Met Gly Lys Asn Tyr Val Lys Lys Gly Ser Ala Pro 
635 640 645 650 

GTC TTT GAT CTC GAA GAA ATC GAA GAA GGG CTT GAA GGA TAC GAA GAA 2020 
Val Phe Asp Leu Glu Glu Ile Glu Glu Gly Leu Glu Gly Tyr Glu Glu

655 660 665 

TTG GAG AAA TOG ACA TTA ATG TOG CAG AAG AAT TTC GAG AAA CGA TTC 2068 
Leu Glu Lys Ser Thr Leu Met Ser Gln Lys Asn Phe Glu Lys Arg Phe

670 675 680 

GGA CAA TCA COG GTT TTC ATT G0C TCA ACT TTG ATG G3\A AAT GGT GGC 2116 
Gly Gln Ser Pro Val Phe Ile Ala Ser Thr Leu Met Glu Asn Gly Gly

685 690 695 

CTT OCT GAA GGA ACT AAT TCC ACA TCA CTG ATT AAA GAG GCC ATT CAC 2164 
Leu Pro Glu Gly Thr Asn Ser Thr Ser Leu Ile Lys Glu Ala Ile His

700 705 710 

GTA ATT AGC TGT GGT TAT GAA GAA AAA ACT GAG TGG GGC AAA GAG ATC 2212 
Val Ile Ser Cys Gly Tyr Glu Glu Lys Thr Glu Trp Gly Lys Glu Ile
715 720 725 730 

GGA TGG ATT TAT GGG TOG GTG AOG GAA GAT ATA TTA ACA GGT TTC AAG 2260 
Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys

735 740 745 

ATG CAT TGT AGA GGG TGG AAA TOG GTT TAT TGT OTA COG AAA AGA COG 2308 
Met His Cys Arg Gly Trp Lys Ser Val Tyr Cys Val Pro Lys Arg Pro 

750 755 760 

GCA TTC AAA GGG TCC GCT CCA ATC AAT CTC TOG GAT OGG TTG CAC CAA 2356 
Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu His Gln

765 770 775 

CTT TTG AGA TGG GCA CTT GGT TCT GTA GAA ATT TTC CTT AGT CGT CAC 2404 
Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe Leu Ser Arg His

780 785 790 

TGT OCA CTT TGG TAT GGT TAT GGT GGA AAA CTG AAA TGG CTC GAG AGG 2452 
Cys Pro Leu Trp Tyr Gly Tyr Gly Gly Lys Leu Lys Trp Leu Glu Arg 
795 800 805 810 
CTT OCT TAT ATC AAC AOC ATT GTT TAC OCT TTC AOC TOG ATC OCT TTA 2500
Leu Ala Tyr Ile Asn Thr Ile Val Tyr Pro Phe Thr Ser Ile Pro Leu

815 820 825 

CTC GOC TAT TCT ACT ATT OCA OCT GTT TGT CTT CIC AOC GGC AAA TIC 2548 
Leu Ala Tyr Cys Thr Ile Pro Ala Val Cys Leu Leu Thr Gly Lys Phe

830 835 840 

ATC ATT OCA ACT CTA AOC AAC CTT ACA AGT GTG TOG TTC TIG GCA CTT 2596 

Ile Ile Pro Thr Leu Ser Asn Leu Thr Ser Val Trp Phe Leu Ala Leu

845 850 855 

TTC CTC T0C ATC ATT GCA ACT GGA GIG CTT GAA CTT OGA TGG AGC GGG 2644 
Phe Leu Ser Ile Ile Ala Thr Gly Val Leu Glu Leu Arg Trp Ser Gly

860 865 870 

CTT AGC ATC CAA GAC TGG TOG OGC AAT GAA CAA TTC TOG GTG ATC GGA 2692 
Val Ser Ile Gln Asp Trp Trp Arg Asn Glu Gln Phe Trp Val Ile Gly
875 880 885 890 

GGT GTC TOC GOC CAT CTT TTT OCT GTC TTC CAG GGC CTC CTC AAA GTC 2740 
Gly Val Ser Ala His Leu Phe Ala Val Phe Gln Gly Leu Leu Lys Val

895 900 905 

CTA GCT GGA CTA GAC AOC AAC TTC AOC CTA ACA GCA AAA GCA GCA GAC 2788 
Leu Ala Gly Val Asp Thr Asn Phe Thr Val Thr Ala Lys Ala Ala Asp

910 915 920 

GAT ACA GAA TTC GGT GAA CTT TAT CTC TTC AAA TGG ACA ACT CTC TTA 2836 
Asp Thr Glu Phe Gly Glu Leu Tyr Leu Phe Lys Trp Thr Thr Leu Leu

925 930 935 

ATC OCT OOC ACA ACT CTG ATA ATA CTG AAC ATG GTC GGA GTC GTG GOC 2884 
Ile Pro Pro Thr Thr Leu Ile Ile Leu Asn Met Val Gly Val Val Ala

940 945 950 

GGA GTT TCA GAC GCA ATC AAC AAC GGC TAT GCT TCA TGG GGT OCA TTC 2932 

Gly Val Ser Asp Ala Ile Asn Asn Gly Tyr Gly Ser Trp Gly Pro Leu

955 960 965 970 

TTC GGC AAA CTG TTC TTC GCA TTC TOG GTC ATT CTT CAT CTT TAC OCA 2980 

Phe Gly Lys Leu Phe Phe Ala Phe Trp Val Ile Leu His Leu Tyr Pro

975 980 985 

TTC CTC AAA GGT TTC ATG GGG AGA CAA AAC AGG AOG OOC AOC ATT GTT 3028 
Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg Thr Pro Thr Ile Val

990 995 1000 

CTG CTT TOG TOC ATA CTT TTC GCA TOG ATT TTC TCA CTG GTT TGG GTA 3076 
Val Leu Trp Ser Ile Leu Leu Ala Ser Ile Phe Ser Leu Val Trp Val

1005 1010 1015 

OGG ATC GAT OOC TTC TTC OOC AAA CAA ACA GOT OCA GTT CTT AAA CAA 3124 
Arg Ile Asp Pro Phe Leu Pro Lys Gln Thr Gly Pro Val Leu Lys Gln
1020 1025 1030

TOT GGC GIG GAG TOC TAAATGCTGT TTTACAAAOC TTTCTTATTA TTTTATTTTC 3179 
Cys Gly Val Glu Cys 
1035 

aT l TlTO ACTACTGTTG ATTTGCTGTO ATTCTAAAAG GGATTTATCT TGTTTGTAAA 3239 
AftCnCTOCTA TGATTTTGTT GGTTCAATTT AATTTCTATA TGGTAAAAAA ATATTTCTTT 3299 
AAATTAACTA TA 



(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1039 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Ala Ser Thr Thr Met Ala Ala Gly Phe Gly Ser Leu Ala Val Asp

1 5 10 15

Glu Asn Arg Gly Ser Ser Thr His Gln Ser Ser Thr Lys Ile Cys Arg

20 25 30

Val Cys Gly Asp Lys Ile Gly Gln Lys Glu Asn Gly Gln Pro Phe Val

35 40 45

Ala Cys His Val Cys Ala Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr

50 55 60

Glu Arg Ser Glu Gly Asn Gln Cys Cys Pro Gln Cys Asn Thr Arg Tyr

65 70 75 80

Lys Arg His Lys Gly Ser Pro Arg Ile Ser Gly Asp Glu Glu Asp Asp

85 90 95

Ser Asp Gln Asp Asp Phe Asp Asp Glu Phe Gln Ile Lys Asn Arg Lys

100 105 110

Asp Asp Ser His Pro Gln His Glu Asn Glu Glu Tyr Asn Asn Asn Asn
115 120 125

His Gln Trp His Pro Asn Gly Gln Ala Phe Ser Val Ala Gly Ser Thr

130 135 140

Ala Gly Lys Asp Leu Glu Gly Asp Lys Glu Ile Tyr Gly Ser Glu Glu
145 150 155 160

Trp Lys Glu Arg Val Glu Lys Trp Lys Val Arg Gln Glu Lys Arg Gly

165 170 175

Leu Val Ser Asn Asp Asn Gly Gly Asn Asp Pro Pro Glu Glu Asp Asp

180 185 190

Tyr Leu Leu Ala Glu Ala Arg Gln Pro Leu Trp Arg Lys Val Pro Ile
195 200 205
Ser Ser Ser Leu Ile Ser Pro Tyr Arg Ile Val Ile Val Leu Arg Phe

210 215 220

Phe Ile Leu Ala Phe Phe Leu Arg Phe Arg Ile Leu Thr Pro Ala Tyr
225 230 235 240

Asp Ala Tyr Pro Leu Trp Leu Ile Ser Val Ile Cys Glu Val Trp Phe

245 250 255

Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro Lys Trp Phe Pro Ile Thr

260 265 270

Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu Arg Phe Glu Arg Glu Gly

275 280 285

Glu Pro Asn Gln Leu Gly Pro Val Asp Val Phe Val Ser Thr Val Asp

290 295 300

Leu Leu Lys Glu Pro Pro Ile Ile Thr Ala Asn Ala Val Leu Ser Ile
305 310 315 320

Leu Ala Val Asp Tyr Pro Val Glu Lys Val Cys Cys Tyr Val Ser Asp

325 330 335

Asp Gly Ala Ser Met Leu Leu Phe Asp Ser Leu Ser Glu Thr Ala Glu

340 345 350

Phe Ala Arg Arg Trp Val Pro Phe Cys Lys Lys His Asn Val Glu Pro

355 360 365

Arg Ala Pro Glu Phe Tyr Phe Asn Glu Lys Ile Asp Tyr Leu Lys Asp

370 375 380

Lys Val His Pro Ser Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu
385 390 395 400

Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln

405 410 415

Lys Lys Pro Glu Glu Gly Trp Val Met Gln Asp Gly Thr Pro Trp Pro

420 425 430

Gly Asn Asn Thr Arg Asp His Pro Gly Met Ile Gln Val Tyr Leu Gly

435 440 445

Ser Ala Gly Ala Leu Asp Val Asp Gly Lys Glu Leu Pro Arg Leu Val

450 455 460

Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr Gln His His Lys Lys Ala
465 470 475 480

Gly Ala Glu Asn Ala Leu Val Arg Val Ser Ala Val Leu Thr Asn Ala

485 490 495

Pro Phe Ile Leu Asn Leu Asp Cys Asp His Tyr Ile Asn Asn Ser Lys

500 505 510

Ala Met Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Phe Gly Lys

515 520 525

Lys Leu Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg
530 535 540

His Asp Arg Tyr Ala Asn Arg Asn Val Val Phe Phe Asp Ile Asn Met
545 550 555 560

Leu Gly Leu Asp Gly Leu Gln Gly Pro Val Tyr Val Gly Thr Gly Cys

565 570 575

Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr Asp Pro Pro Val Ser Glu

580 585 590

Lys Arg Pro Lys Met Thr Cys Asp Cys Trp Pro Ser Trp Cys Cys Cys

595 600 605

Cys Cys Gly Gly Ser Arg Lys Lys Ser Lys Lys Lys Gly Glu Lys Lys

610 615 620

Gly Leu Leu Gly Gly Leu Leu Tyr Gly Lys Lys Lys Lys Met Met Gly
625 630 635 640

Lys Asn Tyr Val Lys Lys Gly Ser Ala Pro Val Phe Asp Leu Glu Glu

645 650 655

Ile Glu Glu Gly Leu Glu Gly Tyr Glu Glu Leu Glu Lys Ser Thr Leu

660 665 670

Met Ser Gln Lys Asn Phe Glu Lys Arg Phe Gly Gln Ser Pro Val Phe

675 680 685

Ile Ala Ser Thr Leu Met Glu Asn Gly Gly Leu Pro Glu Gly Thr Asn

690 695 700

Ser Thr Ser Leu Ile Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr
705 710 715 720

Glu Glu Lys Thr Glu Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser

725 730 735

Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp

740 745 750

Lys Ser Val Tyr Cys Val Pro Lys Arg Pro Ala Phe Lys Gly Ser Ala

755 760 765

Pro Ile Asn Leu Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala Leu

770 775 780

Gly Ser Val Glu Ile Phe Leu Ser Arg His Cys Pro Leu Trp Tyr Gly
785 790 795 800

Tyr Gly Gly Lys Leu Lys Trp Leu Glu Arg Leu Ala Tyr Ile Asn Thr

805 810 815

Ile Val Tyr Pro Phe Thr Ser Ile Pro Leu Leu Ala Tyr Cys Thr Ile

820 825 830

Pro Ala Val Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Thr Leu Ser

835 840 845

Asn Leu Thr Ser Val Trp Phe Leu Ala Leu Phe Leu Ser Ile Ile Ala
850 855 860
Thr Gly Val Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Gln Asp Trp
865 870 875 880

Trp Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Val Ser Ala His Leu

885 890 895

Phe Ala Val Phe Gln Gly Leu Leu Lys Val Leu Ala Gly Val Asp Thr

900 905 910

Asn Phe Thr Val Thr Ala Lys Ala Ala Asp Asp Thr Glu Phe Gly Glu

915 920 925

Leu Tyr Leu Phe Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Leu

930 935 940

Ile Ile Leu Asn Met Val Gly Val Val Ala Gly Val Ser Asp Ala Ile
945 950 955 960

Asn Asn Gly Tyr Gly Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe

965 970 975

Ala Phe Trp Val Ile Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met

980 985 990

Gly Arg Gln Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser Ile Leu

995 1000 1005

Leu Ala Ser Ile Phe Ser Leu Val Trp Val Arg Ile Asp Pro Phe Leu

1010 1015 1020

Pro Lys Gln Thr Gly Pro Val Leu Lys Gln Cys Gly Val Glu Cys
1025 1030 1035

(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Gossypium hirsutum L. 

(C) INDIVIDUAL ISOLATE: Coker312 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1857

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
COG ACA TXC CTG AAG GAG OGT OGA GCT ATG AAG AGA GAA TAT GAA GAA 48 
Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu 

1 5 10 15

TPC AAG GTT AGG ATA AAT GCA CTT GTA GOC AAA GOC CAA AAG GIT OCT 96 
Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Val Pro

20 25 30 

CCA GAA GGG TGG ATC ATG CAA GAT GGG ACA OCA TGG CCA GGA AAC AAT 144 
Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn

35 40 45 

ACT AAA GAT CAC OCT GGT ATG ATT CAA GTA TTT CTC GCT CAA ACT GC^ 192 
Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu Gly Gln Ser Gly

50 55 60 

GQC CAT GAT ACC GGA AAT GAG CTT OCT CGT CTC GTC TAT GTA TCT 240 
Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser 

65 70 75 80 

GGA &G AAA AGG OCT GGT TTC TTG CAT CAC AAG AAA GCT GCT GOC ATG 288 
Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys Ala Gly Ala Met 

85 90 95 

AAC GOC CTT GTT COG GTC TOG GGG GTG CTC ACA AAT GCT OCT TTT ATG 336 
Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn Ala Pro Phe Met 

100 105 110 

TTO AAC TTG GAT TGT GAC CAT TAT TTA AAT AAC AGC AAG GCT GTA AGA 384 
Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser Lys Ala Val Arg 

115 120 125 

GAG GCT ATG TCT TTC TTG ATG GAC OCT CAA ATT GGA AGA AAG CTT TGC 432 
Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly Arg Lys Val Cys

130 135 140 

TAT GTC CAA TTC OCT CAA OCT TTC GAT GCT ATT GAT AGA CAT ©VT OGA 480 
Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg His Asp Arg
145 150 155 160 

TAT GOC AAT OGG AAC ACA CTT TTC TTT GAT ATT AAC ATG AAA GGT CTA 528 
Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn Met Lys Gly Leu

165 170 175 

GAT GCT ATA CAA GGC OCT GTA TAT GTC GGC AGG GGG TGT GTT TTC AGA 576 
Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Arg

180 185 190 

AGG CAA GCT CTT TAT GCT TAT GAA OCT CCA AAG GGA OCT AAG CTC COG 624 
Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly Pro Lys Arg Pro

195 200 205 

AAA ATG GTA ACC TOT GGT TGC TGC OCT TCT TTT GGA CGC CGC AGA AAG 672 
Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly Arg Arg Arg Lys

210 215 220 

GAC AAA AAG CAC TCT AAG GAT GCT GGA AAT GCA AAT GCT CTA AGC CTA 720 
Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn Gly Leu Ser Leu
225 230 235 240 
GAA GCA OCX AAA GAT GAC AAG GAB TEA TTO ATG TOC CAC ATG AAC TOT 768
Glu Ala Ala Lys Asp Asp Lys Glu Leu Leu Met Ser His Met Asn Phe 

245 250 255 

GAA AAG AAA TTT GGA CAA TCA GOC ATT TTP GTA ACT TCA ACA CTC AUG 816 
Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr Ser Thr Leu Met

260 265 270 

GAA CAA OCT GOT CTC OCT OCT TCT TCA AGC COC GCA OCT TTG CTC AAA 864 
Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala Ala Leu Leu Lys

275 280 285 

GAA GOC ATT CAT GTA ATT ACT TCT OCT TAT GAA GAC AAA ACA GAA TOG 912 
Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp

290 295 300 

GGA AGC GAG CTT GGC TGG ATT TAC GOC TOG ATT ACA GAA GAT ATC TTA 960 
Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr Glu Asp Ile Leu
305 310 315 320 

ACA GGA TIC AAG ATG CAT TGC OCT GGA TGG AGA TCA ATA TAC TGC ATG 1008 
Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr Cys Met

325 330 335 

OCA AAG TTG OCT GCA TTC AAG GGT TCA GCT OOC ATC AAT CTA TOG GAT 1056 
Pro Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp

340 345 350 

OCT CTA AAC CAA CTC CTT OGA TGG GCA CTC GCT TCT GTT GAA ATT TTC 1104 
Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe

355 360 365 

TTT ACT CAT CAT TGC OCA GCA TGG TAT GCT TTC AAG GGA GGA AAG CTA 1152 
Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys Gly Gly Lys Leu 
370 375 380 

AAA TGG CTT GAA OGA TTC GCA TAT GTC AAC ACA ACC ATC TAC OOC TTC 1200

Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr Ile Tyr Pro Phe
385 390 395 400 

ACA TCT TTA CCA CTT CTC GOC TAT TCT AGC CTA COG GCA ATC TCT TTA 1248 
Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro Ala Ile Cys Leu

405 410 415 

CTT ADC GAT AAA TTT ATC ATG OCA 00G ATA AGC ACC TTT OCA ACT CTA 1296 
Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr Phe Ala Ser Leu

420 425 430 

TTC TIC ATT GOC TTG TTT CTT TCA ATC TTT GCA ACT GOT ATT CTC GAG 1344 
Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr Gly Ile Leu Glu

435 440 445 

CTA AGG TGG ACT GGA CTA AGC ATT GAA GAA TGG TGG AGG AAT GAG CAA 1392 
Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp Arg Asn Glu Gln
450 455 460

TTT TGG GTC ATC GGT GGC ATT TOG GCA CAT TTG TTC GCT GTT ATC CAA 1440 
Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe Ala Val Ile Gln
465 470 475 480 

GGC TTG TTG AAA GTT CTA GCT GGT ATT GAC ACT AAT TTC ACT GTC ACA 1488 
Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr

485 490 495 

TOC AAG GCA ACT GAT GftC GAG GAG TTC GGG GAA TTG TAT ACT TTC AAA 1536 
Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu Tyr Thr Phe Lys

500 505 510 

TOT ACA ACC CTT CTA ATT OCT OCT ACT ACC GTC TTA ATC ATC AAT TTA 1584 
Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu Ile Ile Asn Leu

515 520 525 

GTC GGT GTC GTT GCA GGC ATC TOG GAT GCC ATA AAC AAT GGA TAC CAA 1632 
Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn Asn Gly Tyr Gln

530 535 540 

TCA TGG GGA OCT CTT TTT GGG AAG CTC TTC TTC TCT TTC TGG CTG ATT 1680 
Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser Phe Trp Val Ile
545 550 555 560 

GTC CAT CTC TAT CCA TTC CTC AAA GCT TTA ATG GGG AGA CAA AAC CGG 1728

Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg

565 570 575 

ACA OCA AOC ATT GTT GTT ATA TGG TCA GIG CTA TTG GCT TCA ATC TTC 1776 
Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu Ala Ser Ile Phe

580 585 590 

TOC TTG CTT TGG GTC OGA ATT GAT OCA TTT GTC ATC AAA ACC AAA GGA 1824 
Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met Lys Thr Lys Gly
595 600 605

OCA GftC ACT ACA ATG TCT GGC ATT AAC TCP TCAAAAAAAA TCATCTTOOG 1874 
Pro Asp Thr Thr Met Cys Gly Ile Asn Cys
610 615 
40 'I^UTICITIT AGATTATGCT ATGTGATCTA TCAACAAACA AGAATQGAGA TOCACAAGAC 1934 

AGAATAAAAT TAGACTGAAA GTTTTGTGTA GTTATATATT CATTCTAOCA ACTATAAGTT 1994 
TTGTCATTCA ATTOAAAATA GCTCAACTTT GTCATCAAA 2033 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 618 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: C-terminal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu

1 5 10 15

Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Val Pro

20 25 30

Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn

35 40 45

Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu Gly Gln Ser Gly

50 55 60

Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser

65 70 75 80

Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys Ala Gly Ala Met

85 90 95

Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn Ala Pro Phe Met

100 105 110

Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser Lys Ala Val Arg

115 120 125

Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly Arg Lys Val Cys

130 135 140

Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg His Asp Arg
145 150 155 160

Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn Met Lys Gly Leu

165 170 175

Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Arg

180 185 190

Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly Pro Lys Arg Pro

195 200 205

Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly Arg Arg Arg Lys

210 215 220

Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn Gly Leu Ser Leu
225 230 235 240

Glu Ala Ala Lys Asp Asp Lys Glu Leu Leu Met Ser His Met Asn Phe

245 250 255

Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr Ser Thr Leu Met

260 265 270

Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala Ala Leu Leu Lys

275 280 285

Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp

290 295 300

Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr Glu Asp Ile Leu
305 310 315 320

Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr Cys Met

325 330 335

Pro Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp

340 345 350

Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe

355 360 365

Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys Gly Gly Lys Leu

370 375 380

Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr Ile Tyr Pro Phe
385 390 395 400

Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro Ala Ile Cys Leu

405 410 415

Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr Phe Ala Ser Leu

420 425 430

Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr Gly Ile Leu Glu

435 440 445

Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp Arg Asn Glu Gln

450 455 460

Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe Ala Val Ile Gln
465 470 475 480

Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr

485 490 495

Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu Tyr Thr Phe Lys
500 505 510

Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu Ile Ile Asn Leu

515 520 525

Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn Asn Gly Tyr Gln

530 535 540

Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser Phe Trp Val Ile
545 550 555 560

Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg

565 570 575

Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu Ala Ser Ile Phe

580 585 590

Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met Lys Thr Lys Gly

595 600 605

Pro Asp Thr Thr Met Cys Gly Ile Asn Cys
610 615

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Gossypium hirsutum L.
(C) INDIVIDUAL ISOLATE: Coker312

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 24..1086

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGCAOGAGCT TTCATATCCT CCA AUG GAA GOC AGC GOC GGA CTC GTT GOG 50 
Met Glu Ala Ser Ala Gly Leu Val Ala

1 5 

GOC TCT CAC AAC CGC AAT GAA CTT GIT GTC ATT CAT GOC CAT GAA GAG 98 
Gly Ser His Asn Arg Asn Glu Leu Val Val Ile His Gly His Glu Glu

10 15 20 25 

OCT AAA OCT CTG AAG AAC TTG GAT GGT CAA GTT TGT GAG ATT TCT GGT 146 
Pro Lys Pro Leu Lys Asn Leu Asp Gly Gln Val Cys Glu Ile Cys Gly

30 35 40 

GAT GAA ATT GGG TTG ACG CTC GAT GGA GAT CTT TTC GTC GOC TGC AAC 194 
Asp Glu Ile Gly Leu Thr Val Asp Gly Asp Leu Phe Val Ala Cys Asn

45 50 55 

GAG TGT GGT TTT CCA CTT TCT AGG OCT TCT TAT GAG TAT GAA AGG AGA 242 
Glu Cys Gly Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Arg 

60 65 70 

GAA GGG ACT CAA CAA TGT OCT CAA TGC AAA ACT AGA TAC AAG OCT CTC 290 
Glu Gly Ser Gln Gln Cys Pro Gln Cys Lys Thr Arg Tyr Lys Arg Leu

75 80 85 

AAG GGG ACT OOG AGG GTG GAG GGA GAT GAA GAT GAA GAG GAT GTC GAT 338 
Lys Gly Ser Pro Arg Val Glu Gly Asp Glu Asp Glu Glu Asp Val Asp 

90 95 100 105 

GAT ATC GAA CAT GAA TTC AAC ATT GAT GAT GAA CAA AAC AAG TAT AGA 386 
Asp Ile Glu His Glu Phe Asn Ile Asp Asp Glu Gln Asn Lys Tyr Arg

110 115 120 

AAT ATC GCT GAA TOG ATC CTT CAT GGA AAG ATG AGC TAC GGG AGA GOC 434 
Asn Ile Ala Glu Ser Met Leu His Gly Lys Met Ser Tyr Gly Arg Gly

125 130 135 

OCT GAA GAC GAT GAA GCT TTC CAA ATC CCA COC GGT TEA GCT GOT GTT 482 
Pro Glu Asp Asp Glu Gly Leu Gln Ile Pro Pro Gly Leu Ala Gly Val

140 145 150 

OCm TCT OGG COG GTG AGC GGG (3>G TTC OCA ATA GGA AGC TCT CTT GCT 530 
Arg Ser Arg Pro Val Ser Gly Glu Phe Pro Ile Gly Ser Ser Leu Ala

155 160 165 

TAT GGG GAA CAC ATG TCA AAT AAA OGA CTT CAT OCA TAT OCT ATG TCT 578 
Tyr Gly Glu His Met Ser Asn Lys Arg Val His Pro Tyr Pro Met Ser 
170 175 180 185 

GAA OCT GGA AGT GCA AGA TGG GAT (3*A AAG AAA GAG GC3* GGA TGG AGA 626 
Glu Pro Gly Ser Ala Arg Trp Asp Glu Lys Lys Glu Gly Gly Trp Arg 

190 195 200 

GAA AGG ATG GAT GAT TGG AAA ATG CAG CAA GGG AAT TTG GGT OCT GAA 674 
Glu Arg Met Asp Asp Trp Lys Met Gln Gln Gly Asn Leu Gly Pro Glu
205 210 215 

ocr car gat ax tat gat gct gac atg gct atg err gat gaa gct agg 722 

Pro Asp Asp Ala Tyr Asp Ala Asp Met Ala Met Leu Asp Glu Ala Arg 

220 225 230 

CAG OCA TTG TCA AGG AAA GTG OCA ATT GCA TOG AGC AAA ATC AAT OCT 770 
Gln Pro Leu Ser Arg Lys Val Pro Ile Ala Ser Ser Lys Ile Asn Pro
235 240 245

TAT 0GT ATG GTG ATT GTG GCT OCT CTA CTT ATC CTT GCT TTC TTT CTT 818 
Tyr Arg Met Val Ile Val Ala Arg Leu Val Ile Leu Ala Phe Phe Leu
250 255 260 265 

CGC TAT CGG ATT TTG AAC CCG GTA CAT GAT GCA ATT GGG CTT TGG CTA 866

Arg Tyr Arg Ile Leu Asn Pro Val His Asp Ala Ile Gly Leu Trp Leu

270 275 280 

ACT TCT GTG ATC TGT GAA ATC TGG TTT GGC TTT TCA TGG ATC CTT GAT 914 
Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Ile Leu Asp

285 290 295 

CAG TTC OCT AAA TGG TTC OCT ATT GAC CGC GAG AOG TAT CTC GAT CGC 962 
Gln Phe Pro Lys Trp Phe Pro Ile Asp Arg Glu Thr Tyr Leu Asp Arg
300 305 310

CTT T0C CTC AGG TAT GAG AGG GAA GCT GAG 00C AAC ATG CTT GCT TCT 1010 
Leu Ser Leu Arg Tyr Glu Arg Glu Gly Glu Pro Asn Met Leu Ala Ser 
315 320 325 

CTT GAT ATT TTT GTC AGT ACA GTG GAT CCA TTG AAG GGA OCT OCT CTA 1058

Val Asp He Fhe Val Ser Thr Val Asp Pro Leu Lys Gly Pro Pro Leu 
330 335 340 345 

GTA ACA GOG AAT ACA CTT CTA TOG ATC T 1086 
Val Thr Ala Asn Thr Val Leu Ser Ile 
350 
(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 354 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: N-terminal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser His Asn Arg Asn Glu 
1 5 10 15 

Leu Val Val Ile His Gly His Glu Glu Pro Lys Pro Leu Lys Asn Leu 
20 25 30 

Asp Gly Gln Val Cys Glu Ile Cys Gly Asp Glu Ile Gly Leu Thr Val 
35 40 45 

Asp Gly Asp Leu Phe Val Ala Cys Asn Glu Cys Gly Phe Pro Val Cys 
50 55 60 

Arg Pro Cys Tyr Glu Tyr Glu Arg Arg Glu Gly Ser Gln Gln Cys Pro 
65 70 75 80 

Gln Cys Lys Thr Arg Tyr Lys Arg Leu Lys Gly Ser Pro Arg Val Glu 
85 90 95 

Gly Asp Glu Asp Glu Glu Asp Val Asp Asp Ile Glu His Glu Phe Asn 
100 105 110 

Ile Asp Asp Glu Gln Asn Lys Tyr Arg Asn Ile Ala Glu Ser Met Leu 
115 120 125 

His Gly Lys Met Ser Tyr Gly Arg Gly Pro Glu Asp Asp Glu Gly Leu 
130 135 140 

Gln Ile Pro Pro Gly Leu Ala Gly Val Arg Ser Arg Pro Val Ser Gly 
145 150 155 160 

Glu Phe Pro Ile Gly Ser Ser Leu Ala Tyr Gly Glu His Met Ser Asn 
165 170 175 

Lys Arg Val His Pro Tyr Pro Met Ser Glu Pro Gly Ser Ala Arg Trp 
180 185 190 

Asp Glu Lys Lys Glu Gly Gly Trp Arg Glu Arg Met Asp Asp Trp Lys 
195 200 205 

Met Gln Gln Gly Asn Leu Gly Pro Glu Pro Asp Asp Ala Tyr Asp Ala 
210 215 220 

Asp Met Ala Met Leu Asp Glu Ala Arg Gln Pro Leu Ser Arg Lys Val 
225 230 235 240 

Pro Ile Ala Ser Ser Lys Ile Asn Pro Tyr Arg Met Val Ile Val Ala 
245 250 255 

Arg Leu Val Ile Leu Ala Phe Phe Leu Arg Tyr Arg Ile Leu Asn Pro 
260 265 270 

Val His Asp Ala Ile Gly Leu Trp Leu Thr Ser Val Ile Cys Glu Ile 
275 280 285 

Trp Phe Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro Lys Trp Phe Pro 
290 295 300 

Ile Asp Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu Arg Tyr Glu Arg 
305 310 315 320 

Glu Gly Glu Pro Asn Met Leu Ala Ser Val Asp Ile Phe Val Ser Thr 
325 330 335 

Val Asp Pro Leu Lys Gly Pro Pro Leu Val Thr Ala Asn Thr Val Leu 
340 345 350 

Ser Ile 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1000 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA to mRNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Gossypium hirsutum L. 
(C) INDIVIDUAL ISOLATE: Coker312 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1000 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAC AAA OIC CGG COG ACA TTC GTG AAG GftG CGT CGA OCT ATG AAG AGA 48 
Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg 

1 5 10 15 

GAA TAT GAA GAA TTC AAG GTT AGG ATA AAT GCA CTT GTA GCC AAA GCC 96 
Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala 

20 25 30 

CAA AAG GTT OCT OCA GAA GGG TGG ATC ATG CAA GAT GGG ACA CCA TGG 144 
Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp 

35 40 45 

OCA GGA AAC AAT ACT AAA GAT CAC OCT GGT ATG ATT CAA GTA TTT CTC 192 
Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu 
50 55 60 

GCT CAA ACT GGA GGC CAT GAT AOC GAA GGA AAT GAG CTT OCT OCT CIC 240 
Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu 
65 70 75 80 

GTC TAT GTA TCT OGA GAG AAA AOG OCA OCT TIC TTG CAT CAC AAG AAA 288 
Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys 

85 90 95 

GOT GOT GOC ATG AAC GOC CTT CTT OCT GTC TOG GOG GIG CTP ACA AAT 336 
Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn 

100 105 110 

GCT OCT TTT ATG TIG AAC TTG GAT TCT GAC CAC TAT TTA AAT AAC AGC 384 
Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser 

115 120 125 

AAG OCT CTA AGA GAG GCT ATG TCT TTC TTG ATG GAC OCT CAA ATT GGA 432 
Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly 

130 135 140 

Am AAG CTT TGC TAT GTC CAA TIC OCT CAA OCT TTC GAT GCT ATT GAT 480 
Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp 
145 150 155 160 

AGA CAT GAT OGA TAT GOC AAT OGG AAC ACA GIT TTC TTT GAT ATT AAC 528 
Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn 

165 170 175 

ATG AAA GCT CTA GAT GCT ATA CAA GGC OCT GTA TAT GTC GGC AOG GGG 576 
Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly 
180 185 190 

TCT GTT TTC AGA AGG CAA GCT CTT TAT GCT TAT GAA OCT OCA AAG GGA 624 
Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly 
195 200 205 

OCT AAG OGC COG AAA ATG CTA AOC TCT GCT TGC T9C OCT TGC TTT GGA 672 

Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly 

210 215 220 

OGC OGC AGA AAG GAC AAA AAG CAC TCT AAG GAT GCT GGA AAT GCA AAT 720 
Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn 

225 230 235 240 

GOT CTA AGC CTA GAA GCA GOC GAA GAT GAC AAG GAG TTA TTG ATG TCC 768 
Gly Leu Ser Leu Glu Ala Ala Glu Asp Asp Lys Glu Leu Leu Met Ser 

245 250 255 

CAC ATG AAC TTT GAA AAG AAA TTT GGA CAA TCA GOC ATT TTT CTA ACT 816 
His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr 

260 265 270 

TCA ACA CTG ATG GAA CAA GCT GCT GTC OCT OCT TCT TCA AGC OCT GCA 864 
Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala 
275 280 285 

OCT TTG CTC AAA GAA GOC ATT CAT GTA ATT ACT TCT GGT TAT GAA GAC 912 

Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp 

290 295 300 

AAA AOC GAA TGG GGA AGC GAG CTT GGC TGG ATT TAG GGC TOG ATT ACA 960 
Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr 
305 310 315 320 

G&A GAT ATC TTA ACA GGT TTC AAG ATG CAT TOC OCT GGA T 1000 
Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly 
325 330 
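
As a consistency check on the listing, a codon row can be translated with the standard genetic code and compared with the amino acid row printed beneath it. The following is a minimal Python sketch and not part of the application: the function name `translate` and the compact codon-table construction are illustrative assumptions, and the input is the second row of SEQ ID NO: 9 (nucleotides 49 to 96) with its last codon read as GCC, in agreement with the alanine printed under it.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Second row of SEQ ID NO: 9 (assumed reading, see the note above).
row = "GAA TAT GAA GAA TTC AAG GTT AGG ATA AAT GCA CTT GTA GCC AAA GCC"
print(translate(row))  # EYEEFKVRINALVAKA
```

The one-letter result corresponds to Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala. A codon containing a character other than A, C, G or T raises `KeyError`, which makes the sketch usable for flagging rows that were damaged in reproduction.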



(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 333 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg 
1 5 10 15 

Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala 
20 25 30 

Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp 
35 40 45 

Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu 
50 55 60 

Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu 
65 70 75 80 

Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys 
85 90 95 

Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn 
100 105 110 

Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser 
115 120 125 

Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly 
130 135 140 

Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp 
145 150 155 160 

Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn 
165 170 175 

Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly 
180 185 190 

Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly 
195 200 205 

Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly 
210 215 220 

Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn 
225 230 235 240 

Gly Leu Ser Leu Glu Ala Ala Glu Asp Asp Lys Glu Leu Leu Met Ser 
245 250 255 

His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr 
260 265 270 

Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala 
275 280 285 

Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp 
290 295 300 

Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr 
305 310 315 320 

Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly 
325 330 

(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 622 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: C-terminal fragment 
(ix) FEATURE: 
(A) NAME/KEY: 
(B) LOCATION: 
(D) OTHER INFORMATION: Xaa indicates Glu or Lys 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg 
1 5 10 15 

Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala 
20 25 30 

Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp 
35 40 45 

Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu 
50 55 60 

Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu 
65 70 75 80 

Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys 
85 90 95 

Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn 
100 105 110 

Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser 
115 120 125 

Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly 
130 135 140 

Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp 
145 150 155 160 

Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn 
165 170 175 

Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly 
180 185 190 

Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly 
195 200 205 

Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly 
210 215 220 

Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn 
225 230 235 240 

Gly Leu Ser Leu Glu Ala Ala Xaa Asp Asp Lys Glu Leu Leu Met Ser 
245 250 255 

His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr 
260 265 270 

Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala 
275 280 285 

Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp 
290 295 300 

Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr 
305 310 315 320 

Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser 
325 330 335 

Ile Tyr Cys Met Pro Lys Leu Pro Ala His Lys Gly Ser Ala Pro Ile 
340 345 350 

Asn Leu Ser Asp Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser 
355 360 365 

Val Glu Ile Phe Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys 
370 375 380 

Gly Gly Lys Leu Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr 
385 390 395 400 

Ile Tyr Pro Phe Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro 
405 410 415 

Ala Ile Cys Leu Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr 
420 425 430 

Phe Ala Ser Leu Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr 
435 440 445 

Gly Ile Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp 
450 455 460 

Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe 
465 470 475 480 

Ala Val Ile Gln Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn 
485 490 495 

Phe Thr Val Thr Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu 
500 505 510 

Tyr Thr Phe Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu 
515 520 525 

Ile Ile Asn Leu Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn 
530 535 540 

Asn Gly Tyr Gln Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser 
545 550 555 560 

Phe Trp Val Ile Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly 
565 570 575 

Arg Gln Asn Arg Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu 
580 585 590 

Ala Ser Ile Phe Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met 
595 600 605 

Lys Thr Lys Gly Pro Asp Thr Thr Met Cys Gly Ile Asn Cys 
610 615 620 



(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
Gln Xaa Xaa Xaa Xaa Xaa Xaa Arg Trp 
1 5 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GAGAGAGAGA GAGAGAGAGA ACTAGTCTCG AGTTTTTTTT TTTTTTTTTT 

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(ix) FEATURE: 

(A) NAME/KEY: 

(B) LOCATION: 1 . . 4 

(D) OTHER INFORMATION: single strand 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
AATTCGGCAC GAG 

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GACTOAAGAT AAOOCAAAAG 

(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGAATGATGA ATITOOOGG 

(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TGCAGGCAAC TTTGGCATGC 

(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AGCAACACGA GCAAGATGAG GAGGATGACT 

(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
OCGGATOCTT GAAOOCTTCT TCGATTTC 

(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
COGGATCCAC GQCAATGCAT CTTGAAACC 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGTTAGCATA TTGTTPCTAG CATTGGG 

(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
ATCAATGAAA TATGTATACT TCATAGC 

(2) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
ctttoottct Tnwrmu OCATOGC 

(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AaCTTTTTA O^AACAACBT AAATOOC 
Claims 

1. A DNA coding for any one of the following proteins (A) to (C): 

(A) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 2 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 2; 

(B) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 4 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 4; and 

(C) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 8 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 8, and an amino acid sequence shown in SEQ ID NO: 11 or an amino acid 
sequence involving deletion, substitution, insertion, or addition of one or several amino acids relevant to SEQ 
ID NO: 11. 

2. A recombinant vector comprising all or a part of the DNA as defined in claim 1. 

3. A transformed cell transformed with the DNA as defined in claim 1. 

4. A method for controlling cellulose synthesis in a cell, comprising the steps of introducing the DNA as defined in 
claim 1 into the cell, and expressing RNA having a nucleotide sequence homologous to the DNA as defined in 
claim 1 or a nucleotide sequence complementary to the DNA as defined in claim 1. 


EP 0 875 575 A2 



PcsA3 

PcsA3-5'    PcsA3-3' 

PcsA3-682 

FIG. 1 



SEQ ID NO: 14 

5' AATTCGGCACGAG 3' 

3' GCCGTGCTC 5' 



FIG. 2 
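
The pairing shown in FIG. 2 can be checked mechanically: SEQ ID NO: 14 is listed as double-stranded with positions 1 to 4 single-stranded, so the lower strand should be the base-by-base complement of positions 5 to 13. The following is a minimal Python sketch under that reading; the helper name `lower_strand` and its `overhang` parameter are hypothetical and introduced only for illustration.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def lower_strand(upper: str, overhang: int) -> str:
    """Return the strand pairing with `upper` after its 5' overhang.

    The result is written 3'->5', i.e. column by column under the upper strand.
    """
    return upper[overhang:].translate(COMPLEMENT)


upper = "AATTCGGCACGAG"        # SEQ ID NO: 14, written 5'->3'
print(lower_strand(upper, 4))  # GCCGTGCTC
```

The printed strand matches the lower line of FIG. 2, and the four unpaired bases at the 5' end (AATT) are the single-stranded region given in the feature table for SEQ ID NO: 14.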
10 20 30 40 50 60 

(SEQ ID NO: 5) 

PcsA3-3' CC(^AnCCTGAA(5GAKXrrC<iAKTATGAAGAGAGMTATGAAGAATTCAAGGTTAG6 
(SEQ ID NO: 9) 20 30 40 50 60 70 

70 80 90 100 no 120 

PcsA3-682 ATAAATIX^TTGTAOCCAAAfiWXWAAAGGnCCTCCAfiAAGMTGQATCATCCAAGAT 

PcsA3-3' ATAMTGCACTTGTAGCCAA^SCCCAAAAGGnCCTOCAGMGOGTG^TCATGCM 
80 90 100 110 120 130 

130 140 150 160 170 180 



PcsA3-682 



G^X^ACCATGGCCAGCAA^CMTACTAAAGATCACCCTGGTATGArTCAAGTATTTCTC 



PcsA3-3' 6G6ACACCATGGC(^66AAACMTACTAAA8ATCAC<XTGGTATGAnCMCTATTTCTC 
140 150 160 170 180 190 

190 200 210 220 230 240 

PcsA3-882 GGTCAAAGTGGA6GCCATGATACCGAAGGAAATGAGCTTCC7C6TCTCGTCTATGTATCT 

PcsA3-3' GGTCAAAGTGGAGGCCATWTACCGAAGGAMTGAeCTTCCTCGTCTCGTCTATGTATCT 
200 210 220 230 240 250 

250 260 270 280 290 300 

PcsA3-682 CGAGAGAAMGGCCTGGTTTCn^ATCACAAGAAAGCTGGTfKCATGAACGCCCTTGTr 

PcsA3-3' CGAGAGAAMGGCCAGGTTTCnGCATCACAAGAMGCTGGT(£CATGAACKCCTTGTT 
260 270 280 290 300 310 

310 320 330 340 350 360 

PcsA3-682 CG^TCTCGGGGGTGCTCACAAATGCTCCTTnATGnGATWJTTGGATTGTGACCATTAT 

PcsA3-3' CGTGTCTOjGGffiTGCTTACAMTGCTttJTTnATGTTGAACTTGGATTGTG^ 

320 330 340 350 360 370 

370 380 390 400 410 420 

PcsA3-682 nAAATAACAGCAAGGCTGTAAGAGAGGCTATGTGTTTCTTGATGGACCCTCAAATTGGA 

PcsA3-3' nAMTAACA(»AAGaTGTAA6A(5AGGCTATGTGmcnGATGGACCCTCAAATTGGA 
380 390 400 410 420 430 

430 440 450 460 470 480 



PcsA3-682 



AGAAAGGTTTGCTATGTCCAAnCCCTCAACGTTTCGATGGrATTGATAGACAiGATCGA 



PcsA3-3' AGAMGGTnGCTATGTCCMncXXJTCAACGTTTCGATGGTATTGATAGACATGATCGA 
440 450 460 470 480 490 

490 500 510 520 530 540 

PcsA3-682 TATGCCMTCGGAACACAGTTTTCTTTGATATTAACATGAAAGGTCTAGATGGTATACM 

PcsA3-3' TATGCCMTCGa\ACACAGTTrTCmWTAnAACATGAAAGGTCTAGATGGTATACAA 
500 510 520 530 540 550 



FIG. 3 
550 560 570 580 590 600 

PcsA3-682 GGCCCTGTATATGTCGGCAC6GGGTGTGTTTTCA6AAGGCAAGCTCTTTATGGTTAT6AA 
(SEQ ID NO: 5) 

PcsA3-3' GGCCCTGTATATGTCGGCACGGGGTGTGTTTTCA6AAG©}AAGCTCTTTATGGTTATGAA 
(SEQ ID NO: 9) 560 570 580 590 600 610 

610 620 630 640 650 660 

PcsA3-682 CCTCCAAA6GGACCTMGCGCCCGAAMTGGTAACCTGTG6TTGCTGC(XJnGTTTTGGA 



PcsA3-3' CCTCCAAAGGGACCTMGCGCCCGAAAATGGTAACCTGTGGTTGCTGCXCnGCT^ 
620 630 640 650 660 670 

670 680 690 700 710 720 

PcsA3-682 CGXGCAGAAA6GACAAAAAQCACTCTAAGGATGGTGGAAATGCAAATGGTCTAAGCCTA 



PcsA3-3' CGCCGCAGAAAGGACAAAAAGCACTCTAAGGATGGTGGAAATGCAAATGGTCTAAGCCTA 
680 690 700 710 720 730 

730 740 750 760 770 780 

PcsA3-682 GAAGCAGCCAAAGATGACAAGGAGTTATTGATGTCCCACATGAACTTTGAAAAGAAATTT 



PcsA3-3' GAAGCAGCCGAAGATGACAAGGAGTTATTGATGTCCCACATGAACTTTGAAAAGAAATTT 
740 750 760 770 780 790 

790 800 810 820 830 840 

PcsA3-682 GGACAATCAGCCATTTTTGTAACTTCAACACTGATGGAACAAGGTGGTGTCCCTCCTTCT 



PcsA3-3' GGACMTCAGTCAnmGTAACnCAACACTGATGGAACAAGGTGGTGTCCCTCCnCT 
800 810 820 830 840 850 

850 860 870 880 890 900 

PcsA3-682 TCAAGCCCCGCAGCTTTGCTCAAAGAAGCCATTCATGTAATTAGTTGTGGTTATGAAGAC 



PcsA3-3' TCAAGCCCTGCAGCTTTGGTCAAAGAAGCCATTCATGTAATTAGTTGTGGTTATGAAGAC 
860 870 880 890 900 910 

910 920 930 940 950 960 

PcsA3-682 AAAACAGMTGGC^AAKGAGCTTGGCTGGATTTACGGCTCXIATTACAGAAGATATCTTA 



PcsA3-3' AAAACCGMTGGGGAAGCGAGCTTGGCTffi^TTTACGGCTCGATTACAGAAGATATCTTA 
920 930 940 950 960 970 

970 980 
PcsA3-682 ACAGGATTCAAGATGCATTGCCGTGGAT 



PcsA3-3' ACAGGTTTCAAGATGCATTGCCGTGGAT 
980 990 1000 



FIG. 4 
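
FIGS. 3 and 4 place PcsA3-682 (SEQ ID NO: 5) above PcsA3-3' (SEQ ID NO: 9) so that differing positions can be read off column by column. The comparison can be sketched in Python as below. This is an illustrative aid rather than the procedure used in the application: the function name `mismatches` is hypothetical, and the two inputs are the last aligned rows of FIG. 4 as they appear in the figure.

```python
def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (1-based column, base in a, base in b) for each differing column."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


row_682 = "ACAGGATTCAAGATGCATTGCCGTGGAT"  # PcsA3-682, last row of FIG. 4
row_3p = "ACAGGTTTCAAGATGCATTGCCGTGGAT"   # PcsA3-3', last row of FIG. 4
print(mismatches(row_682, row_3p))        # [(6, 'A', 'T')]
```

The single difference falls in a glycine codon (GGA against GGT), so the two rows encode the same residues over this stretch. Column numbers are relative to the row, not to the numbering of either sequence.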
The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 

1. Claims: 1-4 (partially) 

Claims 1-4 (partially) refer to a DNA coding for a protein 
having a cellulose synthase activity and comprising an amino 
acid sequence shown in SEQ ID NO: 2 or an amino acid sequence 
involving deletion, substitution, insertion, or addition of 
one or several amino acids relevant to SEQ ID NO: 2, a 
recombinant vector comprising all or part of said DNA, a 
cell being transformed with said DNA, and a method for 
controlling cellulose synthesis in a cell by the use of said 
DNA. 

2. Claims: 1-4 (partially) 

Claims 1-4 (partially) refer to a DNA coding for a protein 
having a cellulose synthase activity and comprising an amino 
acid sequence shown in SEQ ID NO: 4 or an amino acid sequence 
involving deletion, substitution, insertion, or addition of 
one or several amino acids relevant to SEQ ID NO: 4, a 
recombinant vector comprising all or part of said DNA, a 
cell being transformed with said DNA, and a method for 
controlling cellulose synthesis in a cell by the use of said 
DNA. 

3. Claims: 1-4 (partially) 

Claims 1-4 (partially) refer to a DNA coding for a protein 
having a cellulose synthase activity and comprising an amino 
acid sequence shown in SEQ ID NO: 8 and in SEQ ID NO: 11 or an 
amino acid sequence involving deletion, substitution, 
insertion, or addition of one or several amino acids 
relevant to SEQ ID NO: 8 and/or SEQ ID NO: 11, a recombinant 
vector comprising all or part of said DNA, a cell being 
transformed with said DNA, and a method for controlling 
cellulose synthesis in a cell by the use of said DNA. 